Preface


In 1990 we introduced a one-semester applications of algebra course at 
North Carolina State University for students who had successfully com- 
pleted semesters of linear and abstract algebra. We intended for the course 
to give students more exposure to basic algebraic concepts, and to show 
students some practical uses of these concepts. The course was received 
enthusiastically by both students and faculty and has become one of the 
most popular mathematics electives at NC State. 


When we were originally deciding on material for the course, we knew 
that we wanted to include several topics from coding theory, cryptography, 
and counting (what we call Polya theory). With this in mind, at the sug- 
gestion of Michael Singer, we used George Mackiw’s book Applications of 
Abstract Algebra for the first few years, and supplemented as we saw fit. 
After several years, Mackiw’s book went out of print temporarily. Rather 
than search for a new book for the course, we decided to write our own notes 
and teach the course from a coursepack. About the same time, NC State
incorporated the mathematics software package Maple V™¹ into its calculus
sequence, and we decided to incorporate it into our course as well. The
use of Maple played a central role in the recent development of the course 
because it provides a way for students to see realistic examples of the topics 
discussed without having to struggle with extensive computations. With 
additional notes regarding the use of Maple in the course, our coursepack 
evolved into this book. In addition to the topics discussed in this book, we 
have included a number of other topics in the course. However, the present 
material has become the constant core for the course. 


Our philosophy concerning the use of technology in the course is that
it be a useful tool and not present new problems or frustrations. Consequently,
we have included very detailed instructions regarding the use of
Maple in this book. It is our hope that the Maple discussions are thorough
enough to allow it to be used without much alternative aid. As alternative
aids, we have included a basic Maple tutorial in Appendix A, and an
introduction to some of Maple’s linear algebra commands in Appendix B.
Although we do not require students to produce the Maple code used in
the course, we do require that they obtain a level of proficiency such that
they can make basic changes to provided worksheets to complete numerous
Maple exercises. So that this book can be used for applications of algebra
courses in which Maple is not incorporated, we have separated all Maple
material into sections that are clearly labeled, and separated all Maple and
non-Maple exercises.


¹ Maple V is a registered trademark of Waterloo Maple, Inc., 57 Erb St. W, Waterloo,
Canada N2L6C2, www.maplesoft.com.


When teaching the course, we discuss the material in Chapter 1 as 
needed rather than review it all at once. More specifically, we discuss the 
material in Chapter 1 through examples the first time it is needed in the ap- 
plications that follow. Some of the material in Chapter 1 is review material 
that does not apply specifically to the applications that follow. However, 
for students with weak backgrounds, Chapter 1 provides a comprehensive 
review of all necessary prerequisite mathematics. 


Chapter 2 is a short chapter on block designs. In Chapters 3, 4, and 
5 we discuss some topics from coding theory. In Chapter 3 we introduce 
error-correcting codes, and present Hadamard, Reed-Muller, and Hamming 
codes. In Chapters 4 and 5, we present BCH codes and Reed-Solomon 
codes. Each of these chapters is dependent in part on the preceding chapters.
The dependency of Chapter 3 on Chapter 2 can be avoided by omitting
Sections 3.2, 3.3, and 3.4 on Hadamard and Reed-Muller codes. In Chap- 
ters 6, 7, and 8 we discuss some topics from cryptography. In Chapter 6 
we introduce algebraic cryptography, and present several variations of the 
Hill cryptosystem. In Chapter 7 we present the RSA cryptosystem and 
discuss some related topics, including the Diffie-Hellman key exchange. In 
Chapter 8 we present the ElGamal cryptosystem, and describe how elliptic 
curves can be incorporated into the system naturally. There is a slight de- 
pendency of Chapters 7 and 8 on Chapter 6, and of Chapter 8 on Chapter 
7. Chapter 9 is a stand-alone chapter in which we discuss the Polya count- 
ing techniques, including Burnside’s Theorem and the Polya Enumeration 
Theorem. 


We wish to thank all those who have been involved in the develop- 
ment of this course and book. Pete Hardy taught from the coursepack and 
improved it with his suggestions. Also, Michael Singer suggested various 
topics and wrote notes on some of them. Many students have written on 
this material for various projects. Of these, the recent master’s project by 
Karen Klein on elliptic curves was especially interesting. Finally, we wish to
thank our mentor, Jack Levine, for his interest in our projects, his guidance
as we learned about applications of algebra, and his many contributions to 
the subject, especially in cryptography.


Chapter 1


Preliminary Mathematics 


There are two purposes to this chapter. We very quickly and concisely re- 
view some of the basic algebraic concepts that are probably familiar to many 
readers, and also introduce some topics for specific use in later chapters. 
We will generally not pursue topics any further than is necessary to obtain 
the material needed for the applications that follow. Topics discussed in 
this chapter include permutation groups, the ring of integers, polynomial 
rings, finite fields, and examples that incorporate these topics using the 
philosophies of concepts covered in later chapters. 


1.1 Permutation Groups 


Suppose a set $G$ is closed under an operation $*$. That is, suppose $a * b \in G$
for all $a, b \in G$. Then $*$ is called a binary operation on $G$. We will use the
notation $(G, *)$ to represent the set $G$ with this operation. Suppose $(G, *)$
also satisfies the following three properties.


1. $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$.


2. There exists an identity element $e \in G$ for which $e * a = a * e = a$ for
all $a \in G$.


3. For each $a \in G$, there exists an inverse element $b \in G$ for which
$a * b = b * a = e$. The inverse of $a$ is usually denoted $a^{-1}$ or $-a$.


Then $(G, *)$ is called a group. For example, it can easily be verified that for
the set $\mathbb{Z}$ of integers, $(\mathbb{Z}, +)$ is a group with identity element 0.


Let $S$ be a set, and let $A(S)$ be the set of bijections on $S$. Then an
element $\alpha \in A(S)$ can be uniquely expressed by its action $(s)\alpha$ on the
elements $s \in S$.


Example 1.1 If $S = \{1, 2, 3\}$, then $A(S)$ contains six elements. One of
the $\alpha$ in $A(S)$ can be expressed as $(1)\alpha = 2$, $(2)\alpha = 3$, and $(3)\alpha = 1$. ■


Let $\circ$ represent the composition operation on $A(S)$. Specifically, if
$\alpha, \beta \in A(S)$, then define $\alpha \circ \beta$ by the action
$(s)(\alpha \circ \beta) = ((s)\alpha)\beta$ for $s \in S$.
Since the composition of two bijections on $S$ is also a bijection on $S$, then
$\alpha \circ \beta \in A(S)$. Hence, $\circ$ is a binary operation on $A(S)$. It can easily be
verified that $(A(S), \circ)$ is a group (see Written Exercise 1).


A group $(G, *)$ is said to be abelian or commutative if $a * b = b * a$ for all
$a, b \in G$. For example, since $m + n = n + m$ for all $m, n \in \mathbb{Z}$, then
$(\mathbb{Z}, +)$ is abelian. However, for a set $S$ with more than two elements,
$\alpha \circ \beta \neq \beta \circ \alpha$ for some $\alpha, \beta \in A(S)$. Therefore, if a set $S$
contains more than two elements, then $(A(S), \circ)$ is not abelian.


We will represent the number of elements in a set $S$ by $|S|$. Suppose
$S$ is a set with $|S| = n$. Then $(A(S), \circ)$ is denoted by $S_n$ and called the
symmetric group on n letters. It can easily be shown that $|S_n| = n!$ (see
Written Exercise 2). Suppose $\alpha \in S_n$. Then $\alpha$ can be viewed as a bijection
on the set $\{1, 2, \ldots, n\}$. This bijection can be represented by listing the
elements in the set $\{1, 2, \ldots, n\}$ in a row with their images under $\alpha$ listed
immediately below.

$$\alpha: \begin{pmatrix} 1 & 2 & \cdots & n \\ (1)\alpha & (2)\alpha & \cdots & (n)\alpha \end{pmatrix}$$


Example 1.2 Let $S = \{1, 2, 3\}$, and let $\alpha \in S_3$ be given by $(1)\alpha = 2$,
$(2)\alpha = 3$, and $(3)\alpha = 1$. Then $\alpha$ can be represented as follows.

$$\alpha: \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$$
■


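An element $\alpha \in S_n$ is called a permutation. Note that for permutations
$\alpha, \beta \in S_n$, we can represent $\alpha \circ \beta$ as follows.

$$\begin{pmatrix} 1 & \cdots & n \\ (1)\alpha & \cdots & (n)\alpha \end{pmatrix}
\begin{pmatrix} 1 & \cdots & n \\ (1)\beta & \cdots & (n)\beta \end{pmatrix}
= \begin{pmatrix} 1 & \cdots & n \\ ((1)\alpha)\beta & \cdots & ((n)\alpha)\beta \end{pmatrix}$$

For example, let $\alpha \in S_4$ be given by $(1)\alpha = 2$, $(2)\alpha = 4$, $(3)\alpha = 3$, and
$(4)\alpha = 1$, and let $\beta \in S_4$ be given by $(1)\beta = 4$, $(2)\beta = 3$, $(3)\beta = 2$, and
$(4)\beta = 1$. Then we can express $\alpha \circ \beta$ as follows.

$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix}$$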
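
The composition rule above can be checked with a short Python sketch. It assumes a permutation is stored as a dictionary sending each $s$ to $(s)\alpha$; the helper name `compose` is hypothetical and chosen only for illustration. Following the text's convention, the left factor is applied first.

```python
# Permutations of {1, ..., n} stored as dicts mapping s -> (s)alpha.
# This encoding and the name `compose` are illustrative assumptions.

def compose(alpha, beta):
    """Return alpha o beta, where (s)(alpha o beta) = ((s)alpha)beta."""
    return {s: beta[alpha[s]] for s in alpha}

# The permutations alpha and beta in S_4 from the example above.
alpha = {1: 2, 2: 4, 3: 3, 4: 1}
beta = {1: 4, 2: 3, 3: 2, 4: 1}

alpha_beta = compose(alpha, beta)
beta_alpha = compose(beta, alpha)

# Bottom row of the two-row notation for alpha o beta.
print([alpha_beta[s] for s in (1, 2, 3, 4)])
# The two products differ, so composition is not commutative here.
print(alpha_beta == beta_alpha)
```

Swapping the order of the factors gives a different permutation, which is consistent with the earlier remark that $(A(S), \circ)$ is not abelian when $S$ has more than two elements.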

We now discuss another way to express elements in $S_n$. Let $i_1, i_2, \ldots, i_s$
be distinct elements in the set $S = \{1, 2, \ldots, n\}$. Then $(i_1\ i_2\ i_3 \cdots i_{s-1}\ i_s)$
is called a cycle of length s or an s-cycle, and represents the permutation
$\alpha \in S_n$ that maps $i_1 \mapsto i_2$, $i_2 \mapsto i_3$, $\ldots$, $i_{s-1} \mapsto i_s$, $i_s \mapsto i_1$, and every other
element in $S$ to itself. For example, the permutation

$$\alpha: \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 4 & 5 & 1 & 6 & 2 \end{pmatrix}$$

in $S_6$ can be expressed as the 6-cycle (135624). Note that this expression
of $\alpha$ as a cycle is not unique, for $\alpha$ can also be expressed as (356241) and
(562413), among others.


Next, consider the permutation

$$\beta: \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 4 & 5 & 6 & 1 & 2 \end{pmatrix}$$

in $S_6$. To express $\beta$ using cycle notation, we must use more than one cycle.
For example, we can express $\beta$ as the following “product” of two 3-cycles:
(135)(246). Since these cycles contain no elements in common they are
said to be disjoint. And because they are disjoint, the order in which they
are listed does not matter. The permutation $\beta$ can also be expressed as
(246)(135).


Every permutation in $S_n$ can be expressed as either a single cycle or a
product of disjoint cycles. When a permutation is expressed as a product of
disjoint cycles, cycles of length one are not usually included. For example,
consider the permutation

$$\gamma: \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 4 & 5 & 2 & 1 & 6 \end{pmatrix}$$

in $S_6$. Even though the fact that $\gamma$ maps 6 to itself would be expressed as
the 1-cycle (6), this cycle would not usually be included in the expression
of $\gamma$ as a product of disjoint cycles. That is, $\gamma$ would usually be expressed
as (135)(24) or (24)(135).
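
As a rough illustration of how a disjoint cycle expression can be found, the following Python sketch follows each element until it returns to its starting point. The dictionary encoding and the function name `disjoint_cycles` are assumptions made for this example; the permutations are the $\alpha$, $\beta$, and $\gamma$ in $S_6$ discussed above, with $\gamma$ entered from its description as (135)(24) fixing 6.

```python
def disjoint_cycles(perm):
    """Return the disjoint cycles of perm (dict s -> image), omitting 1-cycles."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, s = [], start
        # Follow start -> (start)perm -> ... until the cycle closes.
        while s not in seen:
            seen.add(s)
            cycle.append(s)
            s = perm[s]
        if len(cycle) > 1:  # cycles of length one are not listed
            cycles.append(tuple(cycle))
    return cycles

alpha = {1: 3, 2: 4, 3: 5, 4: 1, 5: 6, 6: 2}
beta = {1: 3, 2: 4, 3: 5, 4: 6, 5: 1, 6: 2}
gamma = {1: 3, 2: 4, 3: 5, 4: 2, 5: 1, 6: 6}

print(disjoint_cycles(alpha))  # one 6-cycle
print(disjoint_cycles(beta))   # two 3-cycles
print(disjoint_cycles(gamma))  # a 3-cycle and a 2-cycle; 6 is fixed
```

Starting each cycle at its smallest element is only one convention; as noted above, the same cycle can be written starting from any of its elements.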


In an expression of a permutation as a product of cycles, the cycles
need not be disjoint. For example, the permutation $\alpha = (135624)$ defined
above can also be expressed as the product (13)(15)(16)(12)(14) of 2-cycles.
Because these 2-cycles are not disjoint, the order in which they are listed
matters.


A 2-cycle is also called a transposition. Any permutation can be expressed
as a product of transpositions in the way illustrated above for $\alpha$.
Specifically, the cycle $(i_1\ i_2\ i_3 \cdots i_{s-1}\ i_s)$ can be expressed as the product
$(i_1\ i_2)(i_1\ i_3) \cdots (i_1\ i_{s-1})(i_1\ i_s)$ of transpositions. If a permutation can be
expressed as a product of more than one disjoint cycle, then each cycle can
be considered separately when expressing the permutation as a product of
transpositions. For example, the permutation $\beta = (135)(246)$ defined above
can be expressed as (13)(15)(24)(26), and the permutation $\gamma = (135)(24)$
defined above can be expressed as (13)(15)(24).


There are many ways to express a permutation as a product of trans- 
positions, and the number of transpositions in these expressions may vary. 
However, the number of transpositions in the expression of a permutation 
as a product of transpositions is either always even or always odd. A per- 
mutation is said to be even if it can be expressed as a product of an even 
number of transpositions, and odd if it can be expressed as a product of an 
odd number of transpositions. Thus, the product of two even permutations 
is even, and the product of two odd permutations is also even. 
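
The decomposition into transpositions and the resulting parity can be sketched in Python as follows. The function names are hypothetical, cycles are given as tuples, and transpositions are composed left to right to match the text's convention.

```python
def cycle_to_transpositions(cycle):
    """Write (i1 i2 ... is) as (i1 i2)(i1 i3)...(i1 is)."""
    first = cycle[0]
    return [(first, other) for other in cycle[1:]]

def apply_transpositions(transpositions, n):
    """Compose 2-cycles on {1, ..., n}, applying the leftmost one first."""
    perm = {}
    for s in range(1, n + 1):
        image = s
        for a, b in transpositions:
            if image == a:
                image = b
            elif image == b:
                image = a
        perm[s] = image
    return perm

def is_even(cycles):
    """An s-cycle contributes s - 1 transpositions to the total count."""
    return sum(len(c) - 1 for c in cycles) % 2 == 0

# alpha = (135624) becomes (13)(15)(16)(12)(14).
ts = cycle_to_transpositions((1, 3, 5, 6, 2, 4))
print(ts)
print(apply_transpositions(ts, 6))      # recovers alpha
print(is_even([(1, 3, 5, 6, 2, 4)]))    # alpha: five transpositions
print(is_even([(1, 3, 5), (2, 4, 6)]))  # beta: four transpositions
print(is_even([(1, 3, 5), (2, 4)]))     # gamma: three transpositions
```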


The inverse of the cycle $(i_1\ i_2\ i_3 \cdots i_{s-1}\ i_s)$ is $(i_s\ i_{s-1} \cdots i_3\ i_2\ i_1)$.
Suppose $\alpha = \alpha_1 \alpha_2 \cdots \alpha_k \in S_n$, where each $\alpha_i$ is a transposition. Then
$\alpha^{-1} = \alpha_k^{-1} \cdots \alpha_2^{-1} \alpha_1^{-1} = \alpha_k \cdots \alpha_2 \alpha_1$ since $\alpha_i^{-1} = \alpha_i$ for each transposition
$\alpha_i$. Hence, the inverse of an even permutation is even. And because the
identity permutation is even, the subset of even permutations in $S_n$ forms a
group. This group is denoted by $A_n$ and called the alternating group on n
letters. Since $A_n$ is a subset of $S_n$ and forms a group, we call $A_n$ a subgroup
of $S_n$.


Definition 1.1 Let (G,*) be a group, and suppose H is a nonempty subset 
of G. If (H,*) is a group, then H is called a subgroup of G. 


Consider a regular polygon P, such as, for example, an equilateral 
triangle or a square. Any movement of P that preserves the general shape of 
P is called a rigid motion. There are two types of rigid motions — rotations 
and reflections. For a regular polygon P with n sides, there are 2n distinct 
rigid motions. These include the n rotations of P through 360j/n degrees 
for j =1,...,n. The remaining n rigid motions are reflections. If n is even, 
these are the reflections of P across the lines that connect opposite vertices 
or bisect opposite sides of P. If n is odd, these are the reflections of P 
across the lines that are perpendicular bisectors of the sides of P. Since 
the rigid motions of P preserve the general shape of P, they can be viewed
as permutations of the vertices or sides of P. The set of rigid motions of a
regular polygon P forms a group called the symmetries of P.


Example 1.3 Consider the group of symmetries of a square. To express
these symmetries as permutations of the vertices of a square, consider the
following general figure.


[Figure: a square with vertices labeled 1 (top left), 2 (bottom left),
3 (bottom right), and 4 (top right).]


The 8 symmetries of a square can be expressed as permutations of the
vertices of this general figure as follows (rotations are counterclockwise).


Rigid Motion                       Permutation

90° rotation                       (1234)
180° rotation                      (13)(24)
270° rotation                      (1432)
360° rotation                      identity
reflection across horizontal       (12)(34)
reflection across vertical         (14)(23)
reflection across 1-3 diagonal     (24)
reflection across 2-4 diagonal     (13)


Note that expressing these rigid motions as permutations on the vertices of
the preceding general figure yields a subgroup of $S_4$. ■
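
The subgroup claim in Example 1.3 can be spot-checked by brute force. The sketch below is only an illustration under assumed conventions: a permutation is stored as a tuple whose entry in position $s$ is the image of $s$, the helper names `from_cycles` and `compose` are hypothetical, and products are taken left to right as in the text.

```python
from itertools import product

def from_cycles(cycles, n=4):
    """Build the image tuple of a product of disjoint cycles on {1, ..., n}."""
    perm = {s: s for s in range(1, n + 1)}
    for cycle in cycles:
        for i, s in enumerate(cycle):
            perm[s] = cycle[(i + 1) % len(cycle)]
    return tuple(perm[s] for s in range(1, n + 1))

def compose(a, b):
    """Return a o b, with a applied first and b second."""
    return tuple(b[image - 1] for image in a)

# The eight permutations listed in the table of Example 1.3.
table = [
    [(1, 2, 3, 4)], [(1, 3), (2, 4)], [(1, 4, 3, 2)], [],
    [(1, 2), (3, 4)], [(1, 4), (2, 3)], [(2, 4)], [(1, 3)],
]
D4 = {from_cycles(c) for c in table}

identity = (1, 2, 3, 4)
closed = all(compose(a, b) in D4 for a, b in product(D4, repeat=2))
has_inverses = all(any(compose(a, b) == identity for b in D4) for a in D4)
print(len(D4), closed, has_inverses)
```

Closure, the identity, and inverses are checked directly; associativity is inherited from composition of bijections.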


When the symmetries of an n-sided regular polygon are expressed as
permutations on the set $\{1, 2, \ldots, n\}$, the resulting subgroup of $S_n$ is
denoted by $D_n$ and called the dihedral group on n letters. The subgroup of
$S_4$ in Example 1.3 is the dihedral group $D_4$.


A group $(G, \cdot)$, or just $G$ for short, is called cyclic if there is an element
$a \in G$ for which $G = \{a^i \mid i \in \mathbb{Z}\}$. In this case, $a$ is called a cyclic generator
for $G$. More generally, suppose $a$ is an element in a group $G$, and let
$H = \{a^i \mid i \in \mathbb{Z}\}$. Then $H$ is a subgroup of $G$ called the cyclic group
generated by $a$. Let $a^i = a^j$ for some $0 \le i < j$. Then $a^{j-i} = a^j a^{-i} = e$,
where $e$ is the identity element in $G$. Thus, there is a smallest positive
integer $m$ for which $a^m = e$. Now, suppose $a^t = e$. Since $t = mq + r$
for some $0 \le r < m$, and $a^t = a^{mq+r} = (a^m)^q a^r = a^r$, it follows that
$r = 0$. Hence, $m$ divides $t$. Since $a^i = a^j$ for $i < j$ forces $a^{j-i} = e$, a
contradiction if $0 < j - i < m$, the set $\{a^i \mid 0 \le i < m\}$ consists of $m$
distinct elements. Furthermore, for any integer $k$ we can write $k = mq + r$
for some $0 \le r < m$ with $a^k = a^r$. Therefore, $H = \{a^i \mid 0 \le i < m\}$,
and $H$ contains $m$ elements. We summarize this discussion as the following
theorem.


Theorem 1.2 Suppose $a$ is an element in a group $G$. If $m$ is the smallest
positive integer for which $a^m = e$, where $e$ is the identity element in $G$,
then the cyclic group generated by $a$ contains $m$ elements.


The value of $m$ in Theorem 1.2 is called the order of $a$. Also, a set
$S$ with $|S| = n$ is said to have order $n$. Hence, the order of an element $a$
in a group $G$ is the order of the cyclic subgroup of $G$ generated by $a$. We
will show in Theorem 1.4 that for an element of order $m$ in a group $G$ of
order $n$, $m$ must divide $n$. Therefore, in a group $G$ of order $n$, $a^n = e$ for
all $a \in G$, where $e$ is the identity element in $G$. We summarize this as the
following corollary.


Corollary 1.3 Suppose $a$ is an element in a group $G$ of order $n$. Then
$a^n = e$, where $e$ is the identity element in $G$.


Example 1.4 Consider the dihedral group $D_n$ of order $2n$. Recall that
the elements in $D_n$ can be viewed as the symmetries of an n-sided regular
polygon P. Each of the n reflections of P has order 2. Also, the rotations
of P through $360/n$ and $360(n-1)/n$ degrees have order n (as do, possibly,
some other rotations). Note that these orders divide $|D_n|$. ■
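
For the case $n = 4$, the orders in Example 1.4 can be computed directly. This is a minimal sketch, assuming the eight elements of $D_4$ are entered as image tuples taken from the table in Example 1.3; the helper names `compose` and `order` are hypothetical.

```python
def compose(a, b):
    """Return a o b, with a applied first and b second."""
    return tuple(b[image - 1] for image in a)

def order(a):
    """Smallest positive m with a^m equal to the identity."""
    identity = tuple(range(1, len(a) + 1))
    power, m = a, 1
    while power != identity:
        power = compose(power, a)
        m += 1
    return m

# Image tuples for D_4, in the order of the table in Example 1.3.
D4 = [
    (2, 3, 4, 1),  # 90 degree rotation (1234)
    (3, 4, 1, 2),  # 180 degree rotation (13)(24)
    (4, 1, 2, 3),  # 270 degree rotation (1432)
    (1, 2, 3, 4),  # identity
    (2, 1, 4, 3),  # (12)(34)
    (4, 3, 2, 1),  # (14)(23)
    (1, 4, 3, 2),  # (24)
    (3, 2, 1, 4),  # (13)
]

orders = [order(a) for a in D4]
print(orders)
print(all(len(D4) % m == 0 for m in orders))  # each order divides |D_4| = 8
```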


1.2 Cosets and Quotient Groups 


Let $H$ be a subgroup of a group $G$. For an element $g \in G$, we define
$gH = \{gh \mid h \in H\}$, called a left coset of $H$ in $G$. Since $gh_1 = gh_2$ implies
$h_1 = h_2$ for all $h_1, h_2 \in H$, then there is a one-to-one correspondence
between the elements in $gH$ and $H$. Thus, if $H$ is finite, $|gH| = |H|$.
Suppose $g_1, g_2 \in G$. If $x \in g_1H \cap g_2H$ for some $x \in G$, then $x = g_1h_1 =
g_2h_2$ for some $h_1, h_2 \in H$. Hence, $g_1 = g_2h_2h_1^{-1} \in g_2H$. Then for any
$y \in g_1H$, it follows that $y = g_1h_3 = g_2h_2h_1^{-1}h_3 \in g_2H$ for some $h_3 \in H$.
Therefore, $g_1H \subseteq g_2H$. Similarly, $g_2H \subseteq g_1H$, so $g_1H = g_2H$. The
preceding arguments imply that if $g_1, g_2 \in G$, then either $g_1H = g_2H$, or
$g_1H$ and $g_2H$ are disjoint. Hence, $G$ is the union of pairwise disjoint left
cosets of $H$ in $G$.


Example 1.5 Consider the subgroup $A_n$ of $S_n$. If $\alpha$ is an odd permutation
in $S_n$, then $\alpha A_n$ and $A_n$ are disjoint. If $\beta$ is any other odd permutation
in $S_n$, then $\beta^{-1}\alpha$ will be even. Therefore, $\beta^{-1}\alpha \in A_n$, and $\alpha A_n = \beta A_n$.
Hence, there are two left cosets of $A_n$ in $S_n$, one consisting of the even
permutations in $S_n$, and the other consisting of the odd permutations. ■


For a finite group G with subgroup H, the following theorem is a 
fundamental algebraic result regarding the number of left cosets of H in G. 
This theorem is called Lagrange’s Theorem. 


Theorem 1.4 Let $G$ be a group of order $n$ with subgroup $H$ of order $k$,
and suppose there are $t$ distinct left cosets of $H$ in $G$. Then $n = kt$.


Proof. Each of the $t$ distinct left cosets of $H$ in $G$ contains $k$ elements.
Since $G$ is the union of these left cosets, then $n = kt$. ■


As a consequence of Lagrange’s Theorem, the order of a subgroup $H$ in
a finite group $G$ must divide the order of $G$. For example, the dihedral group
$D_4$ of permutations in Example 1.3 has order 8, which divides $|S_4| = 24$.
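
The coset count behind this example can be reproduced computationally. The sketch below assumes permutations are stored as image tuples and composed left to right, and it enters $D_4$ from the table in Example 1.3; `compose` is a hypothetical helper name.

```python
from itertools import permutations

def compose(a, b):
    """Return a o b, with a applied first and b second."""
    return tuple(b[image - 1] for image in a)

S4 = list(permutations(range(1, 5)))
D4 = {
    (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (1, 2, 3, 4),
    (2, 1, 4, 3), (4, 3, 2, 1), (1, 4, 3, 2), (3, 2, 1, 4),
}

# Each left coset gH is the set {g o h : h in H}; duplicates collapse in the set.
cosets = {frozenset(compose(g, h) for h in D4) for g in S4}

print(len(cosets))                      # t, the number of distinct left cosets
print({len(c) for c in cosets})         # every coset has k = |D_4| elements
print(len(S4) == len(D4) * len(cosets)) # n = kt
```

Because distinct cosets are disjoint, the coset sizes add up to $|S_4|$, which is exactly the counting argument in the proof of Theorem 1.4.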


We began this section by defining the left cosets gH of a subgroup H 
in a group G. Results analogous to those discussed so far in this section 
also hold for the sets $Hg = \{hg \mid h \in H\}$, called right cosets of $H$ in $G$.


Next, we discuss how cosets can be used to construct new groups from
known ones. Suppose $H$ is a subgroup of a group $G$. Then for $x \in G$,
let $x^{-1}Hx = \{x^{-1}hx \mid h \in H\}$. If $x^{-1}Hx \subseteq H$ for all $x \in G$, then $H$ is
called a normal subgroup of $G$. As we will show, if $H$ is a normal subgroup
of a group $G$, then the set of left cosets of $H$ in $G$ forms a group with
the operation $(xH)(yH) = (xy)H$. To see this, note first that since $H$ is
normal in $G$, then $x^{-1}Hx \subseteq H$ for all $x \in G$. Specifically, this will be true
if we replace $x$ with $x^{-1}$. That is, $(x^{-1})^{-1}Hx^{-1} = xHx^{-1} \subseteq H$. Thus,
for any $h \in H$, it follows that $h = x^{-1}(xhx^{-1})x = x^{-1}h_1x \in x^{-1}Hx$ for
some $h_1 \in H$. Hence, $H \subseteq x^{-1}Hx$, and since $H$ is normal in $G$, then
$x^{-1}Hx = H$. Therefore, a subgroup $H$ in a group $G$ satisfies $x^{-1}Hx = H$
for all $x \in G$ if and only if $H$ is normal in $G$.


To see that the operation defined above for the left cosets of $H$ in $G$
is well-defined, let $xH = x_1H$ and $yH = y_1H$ for some $x, x_1, y, y_1 \in G$.
Since $xH = x_1H$ and $yH = y_1H$, then $x = x_1h_1$ and $y = y_1h_2$ for some
$h_1, h_2 \in H$. And since $H$ is normal in $G$, then $y_1^{-1}h_1y_1 = h_3$ for some
$h_3 \in H$, or, equivalently, $h_1y_1 = y_1h_3$ for some $h_3 \in H$. This yields $xy =
x_1h_1y_1h_2 = x_1y_1h_3h_2 \in x_1y_1H$. Thus, $xy \in x_1y_1H$, and $xyH = x_1y_1H$.
Therefore, the operation defined above for the left cosets of $H$ in $G$ is
well-defined.


We can now easily show that if $H$ is a normal subgroup of a group
$G$, then the set of left cosets of $H$ in $G$ forms a group with the operation
$(xH)(yH) = (xy)H$. This group, denoted $G/H$, is called a quotient group.


Theorem 1.5 Suppose $H$ is a normal subgroup of a group $G$. Then the
set $G/H = \{xH \mid x \in G\}$ of left cosets of $H$ in $G$ forms a group with the
operation $(xH)(yH) = (xy)H$.


Proof. If $e$ is the identity element in $G$, then $eH = H$ is the identity
in $G/H$ since $(eH)(xH) = (ex)H = xH$ and $(xH)(eH) = (xe)H = xH$
for all $x \in G$. Also, the inverse of the element $xH$ in $G/H$ is $x^{-1}H$ since
$(x^{-1}H)(xH) = (x^{-1}x)H = eH = H$. The associative law in $G/H$ can
easily be verified. ■


Note that if G is abelian, then any subgroup H of G is normal and 
G/H is abelian. 


Example 1.6 Let $G = (\mathbb{Z}, +)$. Choose a positive integer $n \in \mathbb{Z}$, and let $H$
be the cyclic subgroup of $G$ generated by $n$. Since the operation on this
group is addition, then $H = \{kn \mid k \in \mathbb{Z}\}$ and additive notation $x + H$ is
used for the cosets of $H$ in $G$. That is, the cosets of $H$ in $G$ are the sets
$x + H = \{x + h \mid h \in H\} = \{x + kn \mid k \in \mathbb{Z}\}$ for all $x \in \mathbb{Z}$. The distinct left
cosets of $H$ in $G$ are the sets $H, 1 + H, 2 + H, \ldots, (n-1) + H$. Hence, $G/H$
consists of these sets with the operation $(x + H) + (y + H) = (x + y) + H$.
Note that if we would perform this operation without including $H$ in the
notation, we would simply be doing integer addition modulo $n$. Note also
that $G/H$ is cyclic with generator $1 + H$. ■
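
Example 1.6 can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the modulus $n = 6$ is an arbitrary choice, each coset $x + H$ is represented by its remainder modulo $n$, and the function names are hypothetical.

```python
n = 6  # arbitrary positive modulus chosen for illustration

def coset_rep(x):
    """Representative in {0, ..., n-1} of the coset x + H, where H = {kn}."""
    return x % n

def add_cosets(x, y):
    """(x + H) + (y + H) = (x + y) + H, returned as a representative."""
    return coset_rep(x + y)

# Well-definedness: other representatives of the same cosets give the same sum.
print(add_cosets(4, 5))
print(add_cosets(4 + 7 * n, 5 - 2 * n))

# 1 + H generates G/H: adding it repeatedly reaches every coset.
generated, current = set(), 0
for _ in range(n):
    current = add_cosets(current, 1)
    generated.add(current)
print(sorted(generated))
```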


Suppose $H$ is a normal subgroup of a group $G$, and define the mapping
$\varphi : G \to G/H$ by $\varphi(x) = xH$. For this mapping $\varphi$, it can easily be seen
that $\varphi(xy) = \varphi(x)\varphi(y)$ for all $x, y \in G$. Since $\varphi$ satisfies this property, we
call $\varphi$ a homomorphism.


Definition 1.6 Let $G$ and $H$ be groups. A mapping $\varphi : G \to H$ that
satisfies $\varphi(xy) = \varphi(x)\varphi(y)$ for all $x, y \in G$ is called a homomorphism.


Example 1.7 Let $H$ be the group $H = \{\text{odd}, \text{even}\}$ with identity element
even. Define $\varphi : S_n \to H$ by $\varphi(x) = \text{even}$ if $x$ is an even permutation, and
$\varphi(x) = \text{odd}$ if $x$ is an odd permutation. Then $\varphi$ is a homomorphism. ■


Example 1.8 Let $G$ be the multiplicative group of nonsingular $n \times n$ matrices
over the reals (i.e., with entries in the reals). Then the determinant
function is a homomorphism from $G$ onto the multiplicative group of
nonzero reals. ■


Let $\varphi$ be a homomorphism from $G$ into $H$. We define the kernel of $\varphi$
to be the set $\text{Ker } \varphi = \{g \in G \mid \varphi(g) = e\}$, where $e$ is the identity element
in $H$. It can easily be verified that $\text{Ker } \varphi$ is a normal subgroup of $G$ (see
Written Exercise 14). Also, if $H$ is a normal subgroup of $G$, and if we
define the mapping $\varphi : G \to G/H$ by $\varphi(x) = xH$, then $\text{Ker } \varphi = H$. Hence,
every normal subgroup of a group $G$ is the kernel of a homomorphism with
domain $G$, and the kernel of every homomorphism with domain $G$ is a
normal subgroup of $G$.
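
The parity homomorphism of Example 1.7 and its kernel can be checked by brute force for $n = 4$. In this sketch the group $\{\text{even}, \text{odd}\}$ is modeled by $\{1, -1\}$ under multiplication, which is an assumption made for convenience; the sign is computed by counting inversions, and all helper names are hypothetical.

```python
from itertools import permutations, product

def compose(a, b):
    """Return a o b, with a applied first and b second."""
    return tuple(b[image - 1] for image in a)

def inverse(a):
    """Return the permutation undoing a."""
    inv = [0] * len(a)
    for s, image in enumerate(a, start=1):
        inv[image - 1] = s
    return tuple(inv)

def sign(a):
    """+1 for an even permutation, -1 for an odd one (inversion count)."""
    inversions = sum(
        1 for i in range(len(a)) for j in range(i + 1, len(a)) if a[i] > a[j]
    )
    return 1 if inversions % 2 == 0 else -1

S4 = list(permutations(range(1, 5)))

# Homomorphism property: sign(xy) = sign(x) sign(y) for all x, y.
is_hom = all(sign(compose(x, y)) == sign(x) * sign(y) for x, y in product(S4, repeat=2))

# The kernel is the set of even permutations, that is, A_4.
kernel = {x for x in S4 if sign(x) == 1}

# Normality: x^{-1} k x stays in the kernel for every x in S_4.
is_normal = all(
    compose(compose(inverse(x), k), x) in kernel for x in S4 for k in kernel
)
print(is_hom, len(kernel), is_normal)
```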


1.3 Rings and Euclidean Domains 


Let $R$ be a set with two binary operations, an addition “$+$” and multiplication
“$*$”. Suppose $R$ also satisfies the following three properties.


1. $(R, +)$ is an abelian group with identity element we will denote by 0.


2. $(a * b) * c = a * (b * c)$ for all $a, b, c \in R$.


3. $a * (b + c) = (a * b) + (a * c)$ and $(a + b) * c = (a * c) + (b * c)$ for all
$a, b, c \in R$.


Then $R$ is called a ring. If also $a * b = b * a$ for all $a, b \in R$, then $R$ is said to
be commutative. And if there exists a multiplicative identity element $1 \in R$
for which $1 * a = a * 1 = a$ for all $a \in R$, then $R$ is said to be a ring with
identity. As is customary, we will suppress the $*$ from the notation when
performing the multiplication operation in a ring.


All of the rings we will use in this book will be commutative with 
identity. A commutative ring R with identity is called an integral domain 
if $ab = 0$ with $a, b \in R$ implies $a = 0$ or $b = 0$. A commutative ring R with
identity is called a field if every nonzero element in R has a multiplicative 
inverse in R. All fields are integral domains. 


Two rings we will use extensively are the ring $F[x]$ of polynomials in
$x$ with coefficients in a field $F$ and the ring $\mathbb{Z}$ of integers with the usual
operations of addition and multiplication. Both $F[x]$ and $\mathbb{Z}$ are integral
domains, but not fields.


Suppose $B$ is a nonempty subset of a commutative ring $R$. If $(B, +)$
is a subgroup of $(R, +)$, and if $rb \in B$ for all $r \in R$ and $b \in B$, then $B$
is called an ideal of $R$. If also there exists an element $b \in B$ for which
$B = \{rb \mid r \in R\}$, then $B$ is called a principal ideal. In this case we denote
$B = (b)$ and call $B$ the ideal generated by $b$.


If $f(x) \in F[x]$, then $(f(x))$ consists of all multiples of $f(x)$ over $F$.
That is, $(f(x))$ consists of all polynomials in $F[x]$ of which $f(x)$ is a factor.
A similar result holds for integers $n \in \mathbb{Z}$. We will show in Theorem 1.9
that all ideals in $F[x]$ and $\mathbb{Z}$ are principal ideals.


Ideals play a role in ring theory analogous to the role played by normal
subgroups in group theory. For example, we can use an ideal of a known
ring to construct a new ring. Suppose $B$ is an ideal in a commutative ring
$R$. Since $(B, +)$ is a subgroup of the abelian group $(R, +)$, it follows that
$R/B = \{r + B \mid r \in R\}$ is an abelian group with the addition operation
$(r + B) + (s + B) = (r + s) + B$. In fact, $R/B$ is a commutative ring
with the multiplication operation $(r + B)(s + B) = (rs) + B$. To see
that this multiplication operation is well-defined, let $r + B = r_1 + B$ and
$s + B = s_1 + B$ for some $r, r_1, s, s_1 \in R$. Since $r + B = r_1 + B$ and
$s + B = s_1 + B$, then $r = r_1 + b_1$ and $s = s_1 + b_2$ for some $b_1, b_2 \in B$.
But $rs = (r_1 + b_1)(s_1 + b_2) = r_1s_1 + r_1b_2 + b_1s_1 + b_1b_2 \in r_1s_1 + B$. Thus,
$rs \in r_1s_1 + B$, and hence, $rs + B = r_1s_1 + B$. Therefore, the multiplication
operation defined above for $R/B$ is well-defined. The ring $R/B$ is called a
quotient ring.


Suppose $B$ is an ideal in a commutative ring $R$, and we define the
mapping $\varphi : R \to R/B$ by $\varphi(x) = x + B$. For this mapping $\varphi$, it can easily
be seen that $\varphi(rs) = \varphi(r)\varphi(s)$ and $\varphi(r + s) = \varphi(r) + \varphi(s)$ for all $r, s \in R$.
Since $\varphi$ satisfies these properties, we call $\varphi$ a ring homomorphism.


Definition 1.7 Let $R$ and $S$ be rings. A mapping $\varphi : R \to S$ that satisfies
$\varphi(rs) = \varphi(r)\varphi(s)$ and $\varphi(r + s) = \varphi(r) + \varphi(s)$ for all $r, s \in R$ is called a ring
homomorphism. We define the kernel of $\varphi$ as $\text{Ker } \varphi = \{r \in R \mid \varphi(r) = 0\}$.


Proposition 1.8 Let $R$ and $S$ be commutative rings, and suppose $\varphi$ is a
ring homomorphism from $R$ onto $S$. Then the following statements hold.


1. If $B$ is an ideal in $R$, then the set $\varphi(B) = \{\varphi(r) \in S \mid r \in B\}$ is an
ideal in $S$.


2. If $B$ is an ideal in $S$, then the set $\varphi^{-1}(B) = \{r \in R \mid \varphi(r) \in B\}$ is
an ideal in $R$.


Proof. Exercise. ■


If every ideal in an integral domain $D$ is a principal ideal, then $D$ is
called a principal ideal domain.


We will represent the nonzero elements in a set $S$ by $S^*$. Let $D$ be
an integral domain, and let $\mathbb{N}$ be the set of nonnegative integers. Suppose
there is a mapping $\delta : D^* \to \mathbb{N}$ such that for $a \in D$ and $b \in D^*$, there
exist $q, r \in D$ for which $a = bq + r$ with $r = 0$ or $\delta(r) < \delta(b)$. Then $D$
is called a Euclidean domain. Two examples of Euclidean domains are the
ring $F[x]$ of polynomials over a field $F$ with $\delta(f(x)) = \deg f(x)$, and the
ring $\mathbb{Z}$ of integers with $\delta(n) = |n|$.


Theorem 1.9 Suppose D is a Euclidean domain. Then D is a principal 
ideal domain. 


Proof. Let $B$ be a nonzero ideal in $D$, and let $b \in B$ be a nonzero element
such that $\delta(b)$ is the minimum of all $\delta(x)$ with nonzero $x \in B$. Then choose
$a \in B$. Since $D$ is a Euclidean domain, there exist $q, r \in D$ such that
$a = bq + r$ with $r = 0$ or $\delta(r) < \delta(b)$. But since $r = a - bq$ and $B$ is an
ideal, then $r \in B$. By the choice of $b$, it follows that $r = 0$. Therefore,
$a = bq$, and $a \in (b)$. Hence, $B \subseteq (b)$, but certainly $(b) \subseteq B$, so $B = (b)$. ■
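
The idea of the proof can be illustrated in the Euclidean domain $\mathbb{Z}$ with $\delta(n) = |n|$. The sketch below makes several assumptions for the sake of a finite computation: the ideal is the one generated by 12 and 18 (an arbitrary choice), and only a bounded window of its elements is sampled.

```python
# Sample of the ideal B = {12x + 18y : x, y in Z}, restricted to a finite window.
A, C = 12, 18  # arbitrary generators chosen for illustration
ideal_sample = {A * x + C * y for x in range(-10, 11) for y in range(-10, 11)}

# Pick a nonzero element b minimizing delta(b) = |b|, as in the proof.
b = min((e for e in ideal_sample if e != 0), key=abs)
g = abs(b)  # b and -b are associates, so either one generates the same ideal

# Dividing any sampled element by g leaves remainder 0, so the element lies in (g).
remainders = {e % g for e in ideal_sample}
print(g, remainders)
```

The minimal element found this way coincides with the greatest common divisor of the two generators, which is what the division argument in the proof predicts for $\mathbb{Z}$.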


If an element $a$ in an integral domain $D$ has a multiplicative inverse
in $D$, then $a$ is called a unit. We will denote the set of units
in an integral domain $D$ by $U(D)$. For example, $U(\mathbb{Z}) = \{1, -1\}$, and
$U(F[x]) = \{f(x) \mid f(x) \text{ is a nonzero constant in } F\}$. Elements $a, b \in D$ are
called associates if $a = ub$ for some unit $u \in D$. The only associates of an
element $n \in \mathbb{Z}$ are $n$ and $-n$. The associates of a polynomial $f(x) \in F[x]$
are $cf(x)$ for any nonzero $c \in F$.


For elements $a$ and $b$ in an integral domain $D$, suppose there exists
$x \in D$ for which $ax = b$. Then $a$ is said to divide $b$, written $a \mid b$.


Proposition 1.10 Let $a$, $b$, and $c$ be elements in an integral domain $D$.
Then the following statements hold.


1. If $a \mid b$ and $b \mid c$, then $a \mid c$.


2. $a \mid b$ and $b \mid a$ if and only if $a$ and $b$ are associates in $D$.


3. $a \mid b$ if and only if $(b) \subseteq (a)$.


4. $(a) = (b)$ if and only if $a$ and $b$ are associates in $D$.


Proof. Exercise. ■


A nonzero element $a$ in a Euclidean domain $D$ is said to be irreducible
if for all $b \in D$, $b \mid a$ implies $b$ is a unit or $b$ is an associate of $a$. An ideal
$M$ in a Euclidean domain $D$ with $M \neq D$ is said to be maximal if for all
ideals $B$ in $D$, $M \subseteq B \subseteq D$ implies $B = M$ or $B = D$.


Theorem 1.11 An element a in a Euclidean domain D is irreducible if 
and only if (a) is a maximal ideal in D. 


Proof. Suppose first that $(a)$ is maximal. If $b \mid a$, then $(a) \subseteq (b)$. Hence,
either $(b) = D$, in which case there exists $x \in D$ for which $bx = 1$ and $b$
is a unit, or $(b) = (a)$, in which case $a$ and $b$ are associates. Therefore, $a$
is irreducible. Now, suppose $a$ is irreducible. If $(a) \subseteq (b) \subseteq D$ for some
$b \in D$, then $b \mid a$. Hence, either $b$ is a unit in $D$, in which case $(b) = D$, or
$a$ and $b$ are associates in $D$, in which case $(a) = (b)$. Therefore, $(a)$ is a
maximal ideal in $D$. ■


Theorem 1.12 An ideal M in a Euclidean domain D is maximal if and 
only if the quotient ring D/M is a field. 


Proof. Suppose $M$ is a maximal ideal in $D$, and choose $r + M \in D/M$ such
that $r + M \neq M$. Let $B = (r + M) \subseteq D/M$, and let $C = \varphi^{-1}(B)$, where
$\varphi$ is the ring homomorphism from $D$ onto $D/M$ defined by $\varphi(x) = x + M$.
Since $B$ is an ideal in $D/M$, by Proposition 1.8 we know that $C$ is an ideal
in $D$. Hence, $M \subseteq C \subseteq D$. But since $M$ is maximal and $r + M \neq M$, then
$C = D$. Therefore, $B = D/M$. Thus, there exists an element $s + M \in D/M$
for which $(r + M)(s + M) = 1 + M$, and so $r + M$ has an inverse in $D/M$.
Hence, $D/M$ is a field. Conversely, suppose $D/M$ is a field, and let $B$ be an
ideal in $D$ for which $M \subseteq B \subseteq D$. By Proposition 1.8, we know that $\varphi(B)$
is an ideal in $D/M$. Since the only ideals in a field are the field and $\{0\}$ (see
Written Exercise 16), it follows that either $\varphi(B) = \{M\}$, the zero ideal of
$D/M$, or $\varphi(B) = D/M$. Hence, either $B = M$ or $B = D$, and $M$ is maximal. ■


By combining the results of Theorems 1.11 and 1.12, we obtain the 
following theorem. 


Theorem 1.13 Suppose a is an element in a Euclidean domain D. Then 
the following statements are equivalent. 


1. a is irreducible in D. 
2. (a) is maximal in D. 


3. $D/(a)$ is a field.


1.4 Finite Fields


Finite fields play an important role in several of the applications we discuss 
in this book. In this section, we describe the theoretical basis of construct- 
ing finite fields. Then in Section 1.5 we demonstrate how Maple can be 
used to construct finite fields. 


It can easily be shown (see below) that the ring $\mathbb{Z}_p = \{0, 1, 2, \ldots, p-1\}$
for prime $p$ is a field with the usual operations of addition and multiplication
modulo $p$ (i.e., divide the result by $p$ and take the remainder). This shows
that there are finite fields of order $p$ for every prime $p$. In the following
discussion we show how the fields $\mathbb{Z}_p$ can be used to construct finite fields
of order $p^n$ for every prime $p$ and positive integer $n$. A finite field of order
$p^n$ for prime $p$ and positive integer $n$ is sometimes called a Galois field,
denoted $GF(p^n)$.
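
As a small numerical illustration of the claim about $\mathbb{Z}_p$, the sketch below tests, for each modulus $m$ in a short range, whether every nonzero residue modulo $m$ has a multiplicative inverse, and compares the answer with primality. The range and the function names are arbitrary choices for this example, and a finite check of this kind is evidence rather than a proof.

```python
def is_field(m):
    """True if every nonzero element of Z_m has a multiplicative inverse mod m."""
    return all(any(a * b % m == 1 for b in range(1, m)) for a in range(1, m))

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

# Z_m behaves as a field exactly when m is prime, for every m tested here.
results = {m: is_field(m) for m in range(2, 30)}
print([m for m, ok in results.items() if ok])
print(all(results[m] == is_prime(m) for m in results))
```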


Let $m$ be an irreducible element in a Euclidean domain $D$, and let
$B = (m)$. Then by Theorem 1.13 we know that $D/B$ is a field. If $D$ is the
ring $\mathbb{Z}$ of integers and $m > 0$, then $m$ is a prime $p$ (see Written Exercise 23).
Note then that if we perform the addition and multiplication operations in
$D/B$ without including $B$ in the notation, these operations will be exactly
the addition and multiplication operations in $\mathbb{Z}_p$. That is, we can view
$D/B$ as $\mathbb{Z}_p$.


Now, suppose D is the integral domain Z_p[x] of polynomials over Z_p for
some prime p, and let B = (f(x)) for some irreducible polynomial f(x) of
degree n in D. Then again by Theorem 1.13, we know that D/B is a field.
Each element in D/B is a coset of the form g(x) + B for some g(x) ∈ Z_p[x].
Since Z_p[x] is a Euclidean domain, there exists r(x) ∈ Z_p[x] for which
g(x) + B = r(x) + B with r(x) = 0 or deg r(x) < n. Therefore, each element
in D/B can be expressed as r(x) + B for some r(x) ∈ Z_p[x] with r(x) = 0
or deg r(x) < n. Hence, the elements in D/B can be expressed as r(x) + B
for all r(x) ∈ Z_p[x] with r(x) = 0 or deg r(x) < n. Since a polynomial
r(x) ∈ Z_p[x] with r(x) = 0 or deg r(x) < n can contain up to n terms, and
each of these terms can have any of p coefficients (the p elements in Z_p),
there are p^n polynomials r(x) ∈ Z_p[x] with r(x) = 0 or deg r(x) < n.
That is, the field D/B will contain p^n distinct elements. The operations
on this field are the usual operations of addition and multiplication modulo
f(x) (i.e., divide the result by f(x) and take the remainder). Because
it is possible to find an irreducible polynomial of degree n over Z_p for
every prime p and positive integer n, this shows that there are finite fields
of order p^n for every prime p and positive integer n. It is also true that
all finite fields have order p^n for some prime p and positive integer n (see
Theorem 1.14).


Suppose again that D = Z_p[x] for some prime p, and B = (f(x)) for
some irreducible polynomial f(x) ∈ D. For convenience, when we write
elements and perform the addition and multiplication operations in D/B,
we will not include B in the notation. That is, we will write the elements
r(x) + B in D/B as just r(x).
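
The counting argument and the reduction modulo f(x) described above can be sketched in a few lines of Python (used here only for illustration; it is not part of the Maple material). The names poly_mod and field_elements are hypothetical, f(x) is assumed to be monic, and polynomials are stored as coefficient lists with the constant term first, which is a convention of this sketch only.

```python
from itertools import product

def poly_mod(g, f, p):
    """Remainder of g(x) on division by a monic f(x) over Z_p.

    Polynomials are coefficient lists, constant term first (an assumption of
    this sketch), so x^2 + x + 2 is written [2, 1, 1].
    """
    n = len(f) - 1                      # degree of f
    g = [c % p for c in g]
    for d in range(len(g) - 1, n - 1, -1):
        c = g[d]                        # current leading coefficient
        if c:
            # subtract c * x^(d-n) * f(x) to cancel the x^d term
            for j in range(n + 1):
                g[d - n + j] = (g[d - n + j] - c * f[j]) % p
    g = g[:n]
    return tuple(g + [0] * (n - len(g)))

def field_elements(f, p):
    """All remainders r(x) with r = 0 or deg r < deg f, as coefficient tuples."""
    return list(product(range(p), repeat=len(f) - 1))

f = [2, 1, 1]                           # f(x) = x^2 + x + 2 in Z_3[x]
print(len(field_elements(f, 3)))        # number of remainders, p^n
print(poly_mod([2, 0, 1], f, 3))        # x^2 + 2 reduced modulo f(x)
```

The sketch does not test whether f(x) is irreducible; it only enumerates the p^n candidate remainders and reduces a given polynomial.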


Example 1.9 Suppose D = Z_3[x], and let B = (f(x)) for the irreducible
polynomial f(x) = x^2 + x + 2 ∈ Z_3[x]. (Note: We can show that f(x)
is irreducible by verifying that f(a) ≠ 0 for all a ∈ Z_3; this test suffices
because f(x) has degree 2.) Then the field D/B will contain the 3^2 = 9
polynomials in Z_3[x] of degree less than 2.
That is, D/B = {0, 1, 2, x, x + 1, x + 2, 2x, 2x + 1, 2x + 2}. To add
elements in D/B we simply reduce the coefficients in Z_3. For example,
(2x + 1) + (2x + 2) = 4x + 3 = x. To multiply elements in D/B we can
use several methods. One method is to divide the product by f(x) and
take the remainder. For example, to multiply the elements 2x + 1 and
2x + 2 in D/B, we could form (2x + 1)(2x + 2) = 4x^2 + 6x + 2 = x^2 + 2.
Then, dividing x^2 + 2 by f(x), we obtain a quotient of 1 and remainder
of -x = 2x. Hence, (2x + 1)(2x + 2) = 2x in D/B. Another method for
multiplying elements in D/B is to use the fact that x^2 + x + 2 = 0 in D/B.
Therefore, x^2 = -x - 2 = 2x + 1 in D/B. The identity x^2 = 2x + 1 can
then be used to reduce powers of x in D/B. For example, we can also
compute the product of the elements 2x + 1 and 2x + 2 in D/B by forming
(2x + 1)(2x + 2) = 4x^2 + 6x + 2 = x^2 + 2 = (2x + 1) + 2 = 2x. A third
method for multiplying elements in D/B will be described in general next
and then illustrated in Example 1.10. ∎
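
The second method in Example 1.9 (replacing x^2 by 2x + 1) can be sketched in Python as follows. This is only an illustration under stated assumptions: an element a + bx of D/B is stored as the pair (a, b), and the names add and mul are hypothetical.

```python
P = 3  # coefficients live in Z_3

def add(u, v):
    """(a + bx) + (c + dx), with coefficients reduced in Z_3."""
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    """(a + bx)(c + dx) in Z_3[x]/(x^2 + x + 2), using x^2 = 2x + 1."""
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 is replaced by 2x + 1
    const = a * c + b * d * 1
    lin = a * d + b * c + b * d * 2
    return (const % P, lin % P)

print(add((1, 2), (2, 2)))   # (2x + 1) + (2x + 2)
print(mul((1, 2), (2, 2)))   # (2x + 1)(2x + 2)
```

The two printed pairs correspond to the sum x and the product 2x computed by hand in the example.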


A fundamental fact regarding finite fields is that the nonzero elements
in every finite field form a cyclic multiplicative group (see Theorem 1.15).
Suppose D = Z_p[x] for some prime p, and B = (f(x)) for some irreducible
polynomial f(x) ∈ D. For the field F = D/B, if x is a cyclic generator
for F*, then f(x) is said to be primitive. Hence, if f(x) is primitive, then
all nonzero elements in F can be generated by constructing powers of x
modulo f(x). This is useful because it allows products of elements in F to
be formed by converting the elements to their representations as powers of
x, multiplying the powers of x, and then converting the result back to an
element in F. This is illustrated in the following example.


Example 1.10 Consider the field D/B in Example 1.9. In this field we
can use the identity x^2 = 2x + 1 to construct the elements that correspond
to powers of x. For example, we can construct the field element that
corresponds to x^3 as follows.

    x^3 = x x^2 = x(2x + 1) = 2x^2 + x = 2(2x + 1) + x = 5x + 2 = 2x + 2

Hence, x^3 = 2x + 2 in D/B. And we can construct the field element that
corresponds to x^4 as follows.

    x^4 = x x^3 = x(2x + 2) = 2x^2 + 2x = 2(2x + 1) + 2x = 6x + 2 = 2

Therefore, x^4 = 2 in D/B. The field elements that correspond to subsequent
powers of x can be constructed similarly. We list the field elements
that correspond to the first 8 powers of x in the following table.

    Power    Field Element
    x^1      x
    x^2      2x + 1
    x^3      2x + 2
    x^4      2
    x^5      2x
    x^6      x + 2
    x^7      x + 1
    x^8      1


The only element in D/B not listed in this table is 0. Since all nonzero
elements in D/B are in the cyclic group generated by x, then f(x) =
x^2 + x + 2 is primitive in Z_3[x].


The preceding table is useful for computing products in D/B. For
example, we can form the product of the elements 2x + 1 and 2x + 2 in
D/B as follows.

    (2x + 1)(2x + 2) = x^2 x^3 = x^5 = 2x

Note that this matches the product obtained in Example 1.9. And we can
form the product of the elements 2x and x + 2 in D/B as follows.

    (2x)(x + 2) = x^5 x^6 = x^11 = x^8 x^3 = 1 x^3 = 2x + 2

Other products in D/B can be formed similarly. ∎
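
The table-based multiplication of Example 1.10 can be sketched in Python (for illustration only; the book's own computations use Maple). The sketch assumes an element a + bx is stored as the pair (a, b), and the names times_x, power, log, and mul are hypothetical.

```python
P = 3

def times_x(r):
    """Multiply a + bx by x in Z_3[x]/(x^2 + x + 2), using x^2 = 2x + 1."""
    a, b = r
    # (a + bx)x = ax + b*x^2 = b + (a + 2b)x
    return (b % P, (a + 2 * b) % P)

# power[k] is the field element equal to x^k, for k = 1, ..., 8
power = {}
elem = (1, 0)                      # the element 1, stored as (constant, x-coefficient)
for k in range(1, 9):
    elem = times_x(elem)
    power[k] = elem
log = {e: k for k, e in power.items()}   # inverse lookup: element -> exponent

def mul(u, v):
    """Multiply two nonzero elements by adding exponents modulo 8."""
    k = (log[u] + log[v]) % 8
    return power[8] if k == 0 else power[k]

print(power)
print(mul((1, 2), (2, 2)))         # (2x + 1)(2x + 2)
print(mul((0, 2), (2, 1)))         # (2x)(x + 2)
```

The lookup works only because f(x) is primitive, so every nonzero element appears exactly once among the 8 powers.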


Example 1.11 Suppose D = Z_3[x], and let B = (f(x)) for the polynomial
f(x) = x^2 + 1 ∈ Z_3[x]. Since f(x) is irreducible in Z_3[x], then D/B is a
field of order 3^2 = 9 (with the same elements as the field in Example
1.9). However, note that x^2 = -1 = 2 in D/B, and thus x^4 = 4 = 1
in D/B. Hence, computing powers of x will not generate all 8 nonzero
elements in D/B. Therefore, f(x) = x^2 + 1 is not primitive in Z_3[x],
and we cannot compute all possible products in D/B using the method
illustrated in Example 1.10. However, we can still compute all possible
products in D/B using the methods illustrated in Example 1.9. ∎
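
The difference between the polynomials in Examples 1.10 and 1.11 can be checked by computing the multiplicative order of x directly. The following Python sketch is an illustration only; the function names are hypothetical, f(x) is assumed monic and already known to be irreducible, and polynomials are stored as coefficient lists with the constant term first.

```python
def times_x(r, f, p):
    """Multiply r(x) by x and reduce modulo a monic f(x) over Z_p."""
    n = len(f) - 1
    shifted = (0,) + tuple(r)          # coefficients of x * r(x); index n is the x^n term
    top = shifted[n]
    # subtract top * f(x) to remove the x^n term
    return tuple((shifted[j] - top * f[j]) % p for j in range(n))

def order_of_x(f, p):
    """Smallest k >= 1 with x^k = 1 in Z_p[x]/(f(x)), or None if no such k exists."""
    n = len(f) - 1
    one = (1,) + (0,) * (n - 1)
    r = one
    for k in range(1, p ** n):
        r = times_x(r, f, p)
        if r == one:
            return k
    return None

def is_primitive(f, p):
    """Assumes f is irreducible (this is not checked here)."""
    return order_of_x(f, p) == p ** (len(f) - 1) - 1

print(order_of_x([2, 1, 1], 3))        # x^2 + x + 2 over Z_3
print(order_of_x([1, 0, 1], 3))        # x^2 + 1 over Z_3
```

A polynomial is reported as primitive exactly when the order of x equals p^n - 1, the number of nonzero elements in the field.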

We close this section by proving two fundamental results we have mentioned
regarding finite fields.


Theorem 1.14 Suppose F is a finite field. Then |F| = p^n for some prime
p and positive integer n.


Proof. Let H be the additive subgroup of F generated by 1. Suppose
|H| = mn for some positive integers m, n with m ≠ 1 and n ≠ 1. Then
0 = (mn)1 = (m1)(n1). But since m1 ≠ 0 and n1 ≠ 0, this contradicts the
fact that F is a field. Hence, |H| = p for some prime p. That is, H = Z_p
for some prime p. The field F can then be viewed as a vector space over
H with scalar multiplication given by the field multiplication, so F has a
basis with a finite number of elements, say n. The order of F is the number
p^n of linear combinations of these basis elements over Z_p. ∎


Theorem 1.15 Let F be a finite field. Then F* is a cyclic multiplicative
group.


Proof. Clearly, F* is an abelian multiplicative group. To show that F*
is cyclic, we use the first of the well-known Sylow Theorems, which states
that for a finite group G of order n, if p^k divides n for some prime p and
positive integer k, then G contains a subgroup of order p^k. Suppose |F*|
has prime factorization p_1^{n_1} p_2^{n_2} ··· p_t^{n_t}, and let S_i be subgroups
of order p_i^{n_i} in F* for each i = 1, 2, ..., t. Let k_i = p_i^{n_i}/p_i for each
i = 1, 2, ..., t. Then, if S_i is not cyclic for some i, it follows that a^{k_i} = 1
for all a ∈ S_i. Hence, f(x) = x^{k_i} - 1 has p_i^{n_i} roots in F, a contradiction,
since a polynomial of degree k_i over a field has at most k_i roots. Thus, each
S_i must have a cyclic generator a_i. Let b = a_1 a_2 ··· a_t. Since F* is abelian
and the orders p_i^{n_i} of the a_i are pairwise relatively prime, b has order |F*|,
and so b is a cyclic generator for F*. ∎


1.5 Finite Fields with Maple 


In this section, we show how Maple can be used to construct the nonzero
elements in a finite field Z_p[x]/(f(x)) for prime p and primitive polynomial
f(x) ∈ Z_p[x] as powers of x. We consider the field in Example 1.10.


We begin by defining the polynomial f(x) = x^2 + x + 2 ∈ Z_3[x] used
to construct the field elements.

> f := x -> x^2 + x + 2;

                        f := x -> x^2 + x + 2

We can use the Maple Irreduc function to verify that f(x) is irreducible in
Z_3[x]. The following command will return true if f(x) is irreducible modulo
3, and false if not.

> Irreduc(f(x)) mod 3;

                        true

Hence, f(x) is irreducible in Z_3[x], and Z_3[x]/(f(x)) is a field. However, in
order for us to be able to construct all of the nonzero elements in this field
by computing powers of x, f(x) must also be primitive. We can use the
Maple Primitive function to verify that f(x) is primitive in Z_3[x]. The
following command will return true if f(x) is primitive modulo 3, and false
if not.

> Primitive(f(x)) mod 3;

                        true

Therefore, f(x) is primitive in Z_3[x].


To construct elements in Z_3[x]/(f(x)) as powers of x, we can use the
Maple Powmod function. For example, the following command returns x^6
modulo f(x).

> Powmod(x, 6, f(x), x) mod 3;

                        x + 2


In the preceding command, the polynomial x given by the first parameter 
is raised to the power 6 given by the second parameter, with the output 
displayed after the result is reduced modulo the third parameter f(x) (de- 
fined over the specified modulus 3). The fourth parameter is the variable 
used in the first and third parameters. 


We will now use a Maple for loop to construct and display all of the
8 nonzero elements in Z_3[x]/(f(x)) and corresponding powers of x. In the
following commands, we store the results returned by Powmod for each of
the first 8 powers of x in the variable temp and display these results using
the Maple print command. Note where we use colons and semicolons in
this loop. Note also that we use back ticks "`" in the print statement.

> for i from 1 to 8 do
>    temp := Powmod(x, i, f(x), x) mod 3:
>    print(x^i, ` Field Element: `, temp);
> od:

                x, Field Element: , x
                x^2, Field Element: , 2x + 1
                x^3, Field Element: , 2x + 2
                x^4, Field Element: , 2
                x^5, Field Element: , 2x
                x^6, Field Element: , x + 2
                x^7, Field Element: , x + 1
                x^8, Field Element: , 1

Note that these results match those listed in Example 1.10 for the nonzero
elements in Z_3[x]/(f(x)).


1.6 The Euclidean Algorithm 


Let a and b be nonzero elements in a Euclidean domain D, and consider
an element d ∈ D for which d|a and d|b. Suppose that for all x ∈ D, if x|a
and x|b, then x|d. Then d is called a greatest common divisor of a and b.
We will use the notation d = (a,b) to represent this.


Greatest common divisors do not always exist for two elements in a
general ring. But as we will show in Theorem 1.16, greatest common
divisors do always exist for two elements in a Euclidean domain. As they
are defined above, there is not a unique greatest common divisor of two
elements in a Euclidean domain. For example, in the ring Z of integers,
both 1 and -1 are greatest common divisors of any two distinct primes.
However, it can be shown very easily that if both d_1 and d_2 are greatest
common divisors of two elements in a Euclidean domain D, then d_1 and d_2
are associates in D (see Written Exercise 30).


Theorem 1.16 Let a and b be nonzero elements in a Euclidean domain
D. Then there exists a greatest common divisor d of a and b that can be
expressed as d = au + bv for some u, v ∈ D.


Proof. Let B be an ideal in D of smallest order that contains both
a and b. It can easily be shown that B = {ar + bs | r, s ∈ D} (see
Written Exercise 31). Since D is a Euclidean domain, by Theorem 1.9
we know that D is a principal ideal domain. Hence, B = (d) for some
d ∈ D. Since d generates B, and a, b ∈ B, then d|a and d|b. And since
d ∈ B = {ar + bs | r, s ∈ D}, then d = au + bv for some u, v ∈ D. Now,
if x|a and x|b for some x ∈ D, then a = xr and b = xs for some r, s ∈ D.
Therefore, d = au + bv = xru + xsv = x(ru + sv), and x|d. ∎


When considering only certain specific rings, it is often convenient to 
place restrictions on greatest common divisors to make them unique. For 
example, for elements a and b in the ring Z of integers, there is a unique
positive greatest common divisor of a and b. And for elements a and b in the
ring F[x] of polynomials over a field F, there is a unique greatest common
divisor of a and b that is monic (i.e., that has a leading coefficient of 1).
Since these are the only rings we will use extensively here, for the remainder
of this book we will assume greatest common divisors are defined uniquely
with these restrictions. We should note that even though the greatest
common divisor (a,b) of two integers or polynomials a and b is uniquely
defined with these restrictions, the u and v that yield (a,b) = au + bv need
not be unique.


In several of the applications in this book we will need to determine 
not only the greatest common divisor (a,b) of two integers or polynomials 
a and b, but also u and v that yield (a,b) = au + bv. We will use the 
Euclidean algorithm to do this. We describe this algorithm next. 


Let a and b be nonzero elements in a Euclidean domain D, and let N
be the set of nonnegative integers. Since D is a Euclidean domain, then
there is a mapping δ : D* → N for which we can find q_1, r_1 ∈ D with
a = b q_1 + r_1 and r_1 = 0 or δ(r_1) < δ(b). Suppose δ(r_1) < δ(b). Then we
can find q_2, r_2 ∈ D with b = r_1 q_2 + r_2 and r_2 = 0 or δ(r_2) < δ(r_1). Suppose
δ(r_2) < δ(r_1). Then we can find q_3, r_3 ∈ D with r_1 = r_2 q_3 + r_3 and r_3 = 0
or δ(r_3) < δ(r_2). We continue this process until the first time r_i = 0 (which
is guaranteed to happen eventually since the δ(r_i) form a strictly decreasing
sequence of nonnegative integers). That is, we construct all q_i, r_i for the
following equations.

    a       = b q_1 + r_1               {δ(r_1) < δ(b)}
    b       = r_1 q_2 + r_2             {δ(r_2) < δ(r_1)}
    r_1     = r_2 q_3 + r_3             {δ(r_3) < δ(r_2)}
      ...
    r_{n-2} = r_{n-1} q_n + r_n         {δ(r_n) < δ(r_{n-1})}
    r_{n-1} = r_n q_{n+1} + 0


By working up this list of equations we can see that r_n divides both a
and b. By working down the list we can see that any x ∈ D that divides
both a and b must also divide r_n. Hence, (a,b) = r_n. This technique for
determining (a,b) is called the Euclidean algorithm.
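
For the ring Z of integers, the list of equations above can be generated mechanically. The following Python sketch is an illustration only; it assumes positive integers a and b, and the function name euclid_steps and the sample inputs are hypothetical choices.

```python
def euclid_steps(a, b):
    """Return the equations of the Euclidean algorithm for positive integers.

    Each entry is (dividend, divisor, quotient, remainder), meaning
    dividend = divisor * quotient + remainder.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r
    return steps

steps = euclid_steps(81, 64)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {divisor}*{q} + {r}")
# The divisor in the last equation is the last nonzero remainder, i.e. (a,b).
print("gcd:", steps[-1][1])
```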

We have now shown a technique for determining the greatest common
divisor (a,b) of two integers or polynomials a and b. We must still show a 
technique for finding u and v that yield (a,b) = au + bv. To do this, we 
consider the following table constructed using the q_i, r_i from the preceding
list of equations, and u_i, v_i we describe below. We will call this table a
Euclidean algorithm table.

    Row    Q      R             U             V
    -1     -      r_{-1} = a    u_{-1} = 1    v_{-1} = 0
     0     -      r_0 = b       u_0 = 0       v_0 = 1
     1     q_1    r_1           u_1           v_1
     2     q_2    r_2           u_2           v_2
    ...
     n     q_n    r_n           u_n           v_n

The entries in each row i = 1, 2, ..., n of this table are constructed as
follows. The q_i, r_i are from the i-th equation

    r_{i-2} = r_{i-1} q_i + r_i                     (1.1)

in the preceding list of equations. Note that if we solve for r_i in (1.1), we
obtain the following equation.

    r_i = r_{i-2} - r_{i-1} q_i                     (1.2)

We then construct u_i, v_i by following this pattern for constructing r_i from
q_i. Specifically, we construct u_i, v_i from q_i as follows.

    u_i = u_{i-2} - u_{i-1} q_i                     (1.3)
    v_i = v_{i-2} - v_{i-1} q_i                     (1.4)


Many useful relations exist between the entries in a Euclidean algorithm
table. For example, the following equation is true for all rows i.

    r_i = a u_i + b v_i                             (1.5)

Clearly, this equation is true for rows i = -1 and 0. To see that it is true
for all subsequent rows, assume it is true for all rows i through k - 1. Then,
using (1.2), (1.3), and (1.4), it follows that

    r_k = r_{k-2} - r_{k-1} q_k
        = (a u_{k-2} + b v_{k-2}) - (a u_{k-1} + b v_{k-1}) q_k
        = a(u_{k-2} - u_{k-1} q_k) + b(v_{k-2} - v_{k-1} q_k)
        = a u_k + b v_k.

Specifically, r_n = a u_n + b v_n. But recall, we have stated that r_n = (a,b).
Hence, for u = u_n and v = v_n, we have (a,b) = au + bv.
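
The construction of a Euclidean algorithm table from (1.2), (1.3), and (1.4) can be sketched in Python for positive integers a and b. This is an illustration under stated assumptions: the function name euclid_table and the row layout (i, q_i, r_i, u_i, v_i) are hypothetical, and None stands for the empty Q entries in rows -1 and 0.

```python
def euclid_table(a, b):
    """Rows (i, q_i, r_i, u_i, v_i) of a Euclidean algorithm table for integers a, b > 0."""
    rows = [(-1, None, a, 1, 0), (0, None, b, 0, 1)]
    while rows[-1][2] != 0:
        (_, _, r2, u2, v2), (i, _, r1, u1, v1) = rows[-2], rows[-1]
        q = r2 // r1
        # apply (1.2), (1.3), (1.4) with the same quotient q
        rows.append((i + 1, q, r2 - r1 * q, u2 - u1 * q, v2 - v1 * q))
    return rows[:-1]     # drop the final row, whose remainder is 0

a, b = 81, 64
table = euclid_table(a, b)
for row in table:
    print(row)
# Relation (1.5) should hold in every row.
print(all(a * u + b * v == r for _, _, r, u, v in table))
```

The last row returned holds r_n = (a,b) together with the u and v for which (a,b) = au + bv.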


Another useful relation between the entries in a Euclidean algorithm
table is the following equation for all i = -1, 0, 1, 2, ..., n - 1.

    r_i u_{i+1} - u_i r_{i+1} = (-1)^i b            (1.6)

Note first that this equation is clearly true for i = -1. To see that it is true
for all subsequent i, assume it is true for i = k - 1. Then, using (1.2), (1.3),
and the fact that adding a multiple of a row of a matrix to another row in
the matrix does not change the determinant of the matrix, it follows that

    r_k u_{k+1} - u_k r_{k+1}

        = | r_k        u_k     |
          | r_{k+1}    u_{k+1} |

        = | r_k                        u_k                     |
          | r_{k-1} - r_k q_{k+1}      u_{k-1} - u_k q_{k+1}   |

        = | r_k        u_k     |
          | r_{k-1}    u_{k-1} |

        = r_k u_{k-1} - u_k r_{k-1}
        = -(r_{k-1} u_k - u_{k-1} r_k)
        = -(-1)^{k-1} b
        = (-1)^k b.

Two additional relations that exist between the entries in a Euclidean
algorithm table are the following equations for all i = -1, 0, 1, 2, ..., n - 1.

    r_i v_{i+1} - v_i r_{i+1} = (-1)^{i+1} a        (1.7)
    u_i v_{i+1} - u_{i+1} v_i = (-1)^{i+1}          (1.8)

These equations can be verified in a manner similar to the verification of
(1.6) given above (see Written Exercises 32 and 33).
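
Relations (1.6), (1.7), and (1.8) can be spot-checked numerically for particular integers. The Python sketch below is only a sanity check, not a proof; the function names are hypothetical, and positive integer inputs are assumed.

```python
def table_columns(a, b):
    """Columns R, U, V of a Euclidean algorithm table, starting at row -1."""
    R, U, V = [a, b], [1, 0], [0, 1]
    while R[-1] != 0:
        q = R[-2] // R[-1]
        R.append(R[-2] - R[-1] * q)
        U.append(U[-2] - U[-1] * q)
        V.append(V[-2] - V[-1] * q)
    return R[:-1], U[:-1], V[:-1]      # keep rows -1, 0, 1, ..., n

def sign(i):
    """(-1)^i for any integer i, including i = -1."""
    return 1 if i % 2 == 0 else -1

def check_relations(a, b):
    R, U, V = table_columns(a, b)
    for j in range(len(R) - 1):
        i = j - 1                       # list index j holds table row i = j - 1
        if R[j] * U[j + 1] - U[j] * R[j + 1] != sign(i) * b:        # (1.6)
            return False
        if R[j] * V[j + 1] - V[j] * R[j + 1] != sign(i + 1) * a:    # (1.7)
            return False
        if U[j] * V[j + 1] - U[j + 1] * V[j] != sign(i + 1):        # (1.8)
            return False
    return True

print(check_relations(81, 64))
```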


We close this section with two examples in which we use the Euclidean 
algorithm to find (a,b), and a Euclidean algorithm table to find u and v 
such that (a,b) = au + bv. 


Example 1.12 In this example, we consider a = 81 and b = 64 in Z. To
use the Euclidean algorithm to find (a,b), we form the following equations.

    81 = 64 · 1 + 17
    64 = 17 · 3 + 13
    17 = 13 · 1 + 4
    13 = 4 · 3 + 1
     4 = 1 · 4 + 0

Hence, (81,64) = 1. It can easily be verified that these equations yield the
following Euclidean algorithm table.

    Row    Q    R     U      V
    -1     -    81    1      0
     0     -    64    0      1
     1     1    17    1     -1
     2     3    13   -3      4
     3     1     4    4     -5
     4     3     1  -15     19

Thus, u = -15 and v = 19 satisfy (81,64) = 81u + 64v. ∎


Example 1.13 In this example, we consider a = x^6 + x^2 + x and
b = x^4 + x^2 + x in Z_2[x]. To use the Euclidean algorithm to find (a,b), we
form the following equations.

    a       = b(x^2 + 1) + x^3
    b       = x^3(x) + (x^2 + x)
    x^3     = (x^2 + x)(x + 1) + x
    x^2 + x = x(x + 1) + 0

Therefore, (a,b) = x. The u_i and v_i for the resulting Euclidean algorithm
table are constructed as follows (with all coefficients expressed in Z_2).

    u_1 = u_{-1} - u_0 q_1 = 1 - 0(x^2 + 1)                     = 1
    v_1 = v_{-1} - v_0 q_1 = 0 - 1(x^2 + 1)                     = x^2 + 1
    u_2 = u_0 - u_1 q_2    = 0 - 1(x)                           = x
    v_2 = v_0 - v_1 q_2    = 1 - (x^2 + 1)x                     = x^3 + x + 1
    u_3 = u_1 - u_2 q_3    = 1 - x(x + 1)                       = x^2 + x + 1
    v_3 = v_1 - v_2 q_3    = (x^2 + 1) - (x^3 + x + 1)(x + 1)   = x^4 + x^3

Thus, the Euclidean algorithm table is the following.

    Row    Q          R                U              V
    -1     -          x^6 + x^2 + x    1              0
     0     -          x^4 + x^2 + x    0              1
     1     x^2 + 1    x^3              1              x^2 + 1
     2     x          x^2 + x          x              x^3 + x + 1
     3     x + 1      x                x^2 + x + 1    x^4 + x^3

Hence, u = x^2 + x + 1 and v = x^4 + x^3 satisfy (a,b) = au + bv. ∎
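
The same table construction can be sketched for polynomials over Z_2. In the Python illustration below, a polynomial in Z_2[x] is stored as an integer whose bit k is the coefficient of x^k, so addition and subtraction are both the bitwise XOR. This encoding and the function names are assumptions of the sketch, not notation used elsewhere in this book.

```python
def gf2_divmod(a, b):
    """Quotient and remainder of a(x) divided by b(x) in Z_2[x] (b nonzero)."""
    q = 0
    while a and a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift          # add x^shift to the quotient
        a ^= b << shift          # subtract x^shift * b(x)
    return q, a

def gf2_mul(a, b):
    """Product of a(x) and b(x) in Z_2[x]."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    return prod

def gf2_euclid(a, b):
    """Return (r_n, u_n, v_n) with r_n = a*u_n + b*v_n in Z_2[x]."""
    r_prev, r = a, b
    u_prev, u = 1, 0
    v_prev, v = 0, 1
    while r:
        q, rem = gf2_divmod(r_prev, r)
        r_prev, r = r, rem
        u_prev, u = u, u_prev ^ gf2_mul(u, q)
        v_prev, v = v, v_prev ^ gf2_mul(v, q)
    return r_prev, u_prev, v_prev

a = 0b1000110                    # x^6 + x^2 + x
b = 0b0010110                    # x^4 + x^2 + x
d, u, v = gf2_euclid(a, b)
print(bin(d), bin(u), bin(v))
```

Reading the printed bit patterns back as polynomials allows a direct comparison with the last row of the table in Example 1.13.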


Written Exercises

1. Let A(S) be the set of bijections on a set S, and let ∘ be the composition
   operation on A(S). Show that (A(S), ∘) is a group.

2. Show that |S_n| = n! for the symmetric group S_n.

3. Consider the following elements in S_6:

   [The two-row arrays defining α, β, and γ are not legible in this copy.]

   (a) Find the elements α ∘ β and β ∘ γ in S_6, where ∘ represents the
       composition operation.
   (b) Express α, β, and γ as a cycle or product of disjoint cycles.
   (c) Is α ∘ γ even or odd?
   (d) Find the inverses of α, β, and γ.
   (e) Express α, β, and γ as a product of transpositions.

4. Find the elements in the alternating group A_4.

5. Find the elements in the dihedral group D_3.

6. Find the elements in As ∩ Ds.

7. Find the distinct left cosets of A_4 in S_4.

8. Show that A_3 is cyclic.

9. Find the order of the following elements.

   (a) The 144° rotation in D_5.
   (b) The 144° rotation in D_10.
   (c) Reflection across horizontal in D_10.
   (d) The element α in Written Exercise 3.
   (e) The element (123)(45)(67) in A_7.

10. Show that if a group G is cyclic, then G is abelian.

11. Show that if H is a subgroup of a cyclic group, then H is cyclic.

12. Show that if H is a subgroup of a cyclic group G, then G/H is cyclic.

13. Find the kernel of the homomorphisms in Examples 1.7 and 1.8.

14. Let G and H be groups, and suppose y : G → H is a homomorphism.
    Show that Ker y is a normal subgroup of G.

15. Show that A_n is a normal subgroup of S_n.

16. Show that the only ideals in a field F are F and {0}.

17. Let a be an element in a field F. Define the mapping y : F[x] → F
    by y(f(x)) = f(a). Show that y is a ring homomorphism, and find
    Ker y.

18. Prove Proposition 1.8.

19. Show that the ring F[x] of polynomials over a field F is a Euclidean
    domain with the function δ(f(x)) = deg f(x).

20. Is it true that all ideals in the ring F[x] of polynomials over a field F
    are principal ideals? State how you know.

21. Show that the ring Z of integers is a Euclidean domain with the
    function δ(n) = |n|.

22. Prove Proposition 1.10.

23. Find all irreducible elements in the ring Z of integers.

24. Perform the following calculations.

    (a) (x + 2) + (2x + 2) in the field D/B in Examples 1.9 and 1.10.
    (b) (x + 2)(2x + 2) in the field D/B in Examples 1.9 and 1.10.
    (c) (x + 2) + (2x + 2) in the field D/B in Example 1.11.
    (d) (x + 2)(2x + 2) in the field D/B in Example 1.11.

25. Let f(x) = x^2 + x + 2.

    (a) Show that f(x) is primitive in Z_3[x] by constructing the field
        elements that correspond to powers of x in Z_3[x]/(f(x)).
    (b) Show that f(x) is primitive in Z_5[x] by constructing the field
        elements that correspond to powers of x in Z_5[x]/(f(x)).
    (c) Show that f(x) is not primitive in Z_11[x] by showing that f(x)
        is not irreducible in Z_11[x].

26. Show that f(x) = x^? + x + 1 is primitive in Z_2[x] by constructing the
    field elements that correspond to powers of x in Z_2[x]/(f(x)).

27. Show that f(x) = x^? + x^? + 1 is primitive in Z_2[x] by constructing
    the field elements that correspond to powers of x in Z_2[x]/(f(x)).

28. Show that f(x) = x^? + x + 1 is primitive in Z_2[x] by constructing the
    field elements that correspond to powers of x in Z_2[x]/(f(x)).

29. Let f(x) = x^4 + x^3 + x^2 + x + 1, g(x) = x^4 + x^3 + x^2 + 1, and
    h(x) = x^4 + x^3 + 1. In Z_2[x], one of the polynomials f(x), g(x), and
    h(x) is primitive, one is irreducible but not primitive, and one is not
    irreducible. Which is which? Explain how you know. For the polynomial
    that is irreducible but not primitive, find the multiplicative
    order of x.

30. Show that if d_1 and d_2 are greatest common divisors of two elements
    in an integral domain D, then d_1 and d_2 are associates in D.

31. Let a and b be elements in an integral domain D, and let B be an
    ideal in D of smallest order that contains both a and b. Show that
    B = {ar + bs | r, s ∈ D}.

32. Verify Equation (1.7).

33. Verify Equation (1.8).

34. Use the Euclidean algorithm to find (2272, 716), and use a Euclidean
    algorithm table to find u and v such that (2272, 716) = 2272u + 716v.

35. Let a = x^5 + x^4 + x^3 + x^2 and b = x^4 + x^? + x + 1 in Z_2[x]. Use
    the Euclidean algorithm to find (a,b), and use a Euclidean algorithm
    table to find u and v such that (a,b) = au + bv.


Maple Exercises

1. Find a primitive polynomial of degree 4 in Z_3[x], and use this polynomial
   to construct the nonzero elements in a finite field.

2. Find a primitive polynomial of degree 2 in Z_41[x], and use this polynomial
   to construct the nonzero elements in a finite field.

3. Construct the nonzero elements in a finite field of order 128.

4. Construct the nonzero elements in a finite field of order 127.


Chapter 2


Block Designs


Suppose a magazine editor wishes to compare seven cars by evaluating the 
responses of seven consumers to a series of questions regarding topics such 
as handling and comfort. The most obvious way for the editor to obtain a 
valid comparison of the cars would be to have each of the consumers test 
each of the cars. However, for various reasons such as time or monetary 
constraints, it may not be feasible to have each of the consumers test each 
car. The most convenient way to obtain a comparison of the cars would be 
to have each of the consumers test just one of the cars. But this might not 
yield a valid comparison of the cars due to potential differences among the 
consumers. In this chapter, we discuss some techniques the editor could 
use to ensure a testing scheme that is both fair and reasonable. 


2.1 General Properties of Block Designs 


Let B_1, ..., B_b be subsets of a set S = {a_1, ..., a_v}. We will call the
elements a_i objects and the subsets B_j blocks. This collection of objects
and blocks is called a balanced incomplete block design if it satisfies the
following conditions:


1. Each block contains the same number of objects.
2. Each object is contained in the same number of blocks.
3. Each pair of objects appears together in the same number of blocks.


For convenience, we will refer to balanced incomplete block designs as just
block designs. A block design is described by parameters (v, b, r, k, λ) if it
has v objects and b blocks, each object is contained in r blocks, each block
contains k objects, and each pair of objects appears together in λ blocks.


In all of the (v, b, r, k, λ) block designs we consider in this book, we will
assume k < v and λ > 0. These restrictions are harmless, for clearly k ≤ v,
and k = v corresponds to the case when each block contains all of the
objects. With regard to the example in the introduction to this chapter, this
represents the possibly infeasible case when each of the consumers
(represented by the blocks) tests each of the cars (represented by the objects).
Also, clearly λ ≥ 0, and λ = 0 corresponds to the case when each block
contains only one object. With regard to the example in the introduction
to this chapter, this represents the possibly invalid case when each of the
consumers tests just one of the cars.


Example 2.1 Suppose a magazine editor wishes to obtain a fair and 
reasonable comparison of seven cars by evaluating the opinions of 
seven consumers. If we represent the cars by the elements in the set 
S = {1,2,3,4,5,6,7}, then each consumer can be represented by a block 
containing the cars to be tested by that consumer. For example, the subsets 
{1,2,4}, {2,3,5}, {3,4,6}, {4,5,7}, {5, 6,1}, {6,7,2}, and {7,1,3} of S are 
the blocks in a (7,7,3,3,1) block design, indicating that the first consumer 
should test cars 1, 2, and 4, the second consumer should test cars 2, 3, and 
5, and so forth. Note that in this block design, each car is tested three 
times, each consumer tests three cars, and each pair of cars is tested by the 
same consumer once. Therefore, this design yields a valid comparison of 
the cars while requiring only 21 total tests (versus 49 tests if each consumer 
tests each car). ∎


In this chapter we discuss several techniques for constructing block 
designs, including one that yields the design in Example 2.1. Before dis- 
cussing these techniques, we first mention some general properties of block 
designs. 


Theorem 2.1 The parameters in a (v, b, r, k, λ) block design satisfy the
equations vr = bk and (v - 1)λ = r(k - 1).


Proof. To show that the equation vr = bk holds, we consider the set
T = {(a, B) | a is an object in block B}, and count |T| in two ways. First,
the design has v objects that each appear in r blocks. Hence, |T| = vr. But
the design also has b blocks that each contain k objects. Hence, |T| = bk.
Thus, vr = bk. To show that (v - 1)λ = r(k - 1), we choose an object a_0
in the design. Then for U = {(x, B) | x ≠ a_0 is an object with a_0 in block B},
we count |U| in two ways. First, there are v - 1 objects in the design that
each appear in λ blocks with a_0, so |U| = (v - 1)λ. But there are also r
blocks in the design that each contain a_0 and k - 1 other objects. Hence,
|U| = r(k - 1). Thus, (v - 1)λ = r(k - 1). ∎
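
The three defining conditions and the two equations of Theorem 2.1 can be checked directly for the design in Example 2.1. The following Python sketch is only an illustration; the function name design_parameters is hypothetical, and blocks are assumed to be given as Python sets.

```python
from itertools import combinations

def design_parameters(objects, blocks):
    """Return (v, b, r, k, lam) if the blocks form a block design, else None."""
    v, b = len(objects), len(blocks)
    sizes = {len(B) for B in blocks}                               # condition 1
    reps = {sum(1 for B in blocks if a in B) for a in objects}     # condition 2
    pairs = {sum(1 for B in blocks if x in B and y in B)
             for x, y in combinations(objects, 2)}                 # condition 3
    if len(sizes) != 1 or len(reps) != 1 or len(pairs) != 1:
        return None
    return v, b, reps.pop(), sizes.pop(), pairs.pop()

objects = list(range(1, 8))
blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]
v, b, r, k, lam = design_parameters(objects, blocks)
print((v, b, r, k, lam))
print(v * r == b * k, (v - 1) * lam == r * (k - 1))
```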


For a block design with objects a_1, ..., a_v and blocks B_1, ..., B_b, let
A = (a_ij) be the v × b matrix for which a_ij = 1 if a_i ∈ B_j, and a_ij = 0 if
a_i ∉ B_j. Then A is called an incidence matrix for the design.


Example 2.2 The following is the incidence matrix for the block design
in Example 2.1 with objects and blocks taken in order of appearance.

    [ 1  0  0  0  1  0  1 ]
    [ 1  1  0  0  0  1  0 ]
    [ 0  1  1  0  0  0  1 ]
    [ 1  0  1  1  0  0  0 ]
    [ 0  1  0  1  1  0  0 ]
    [ 0  0  1  0  1  1  0 ]
    [ 0  0  0  1  0  1  1 ]

In this chapter we use incidence matrices for two purposes. In Section 
2.2 we use them to construct block designs. In this section we use them to 
prove some general results about block designs. 


Let A be an incidence matrix for a (v, b, r, k, λ) block design. Note that
the dot product of any row i of A with itself will be equal to the number r
of blocks in the design that contain a_i. Note also that the dot product of
any two distinct rows i and j of A will be equal to the number λ of blocks
in the design that contain both a_i and a_j. Since the matrix AA^t can be
viewed as containing the dot product of every row of A with itself and all
other rows of A, then

            [ r  λ  ...  λ ]
    AA^t =  [ λ  r  ...  λ ]  =  (r - λ)I + λJ,
            [ :  :       : ]
            [ λ  λ  ...  r ]

where I is the v × v identity matrix, and J is the v × v matrix of all ones.
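
For the design in Example 2.1, the incidence matrix and the product AA^t can be computed with a short Python sketch. This is an illustration only; the helper names incidence_matrix and gram are hypothetical, and matrices are stored as plain lists of rows.

```python
def incidence_matrix(objects, blocks):
    """Rows are indexed by objects, columns by blocks; entry is 1 if the object is in the block."""
    return [[1 if a in B else 0 for B in blocks] for a in objects]

def gram(A):
    """The product of A with its transpose, as dot products of pairs of rows."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in A] for ri in A]

objects = list(range(1, 8))
blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]
A = incidence_matrix(objects, blocks)
for row in gram(A):
    print(row)
```

Each printed row should show r on the diagonal and λ elsewhere, which for this design means 3 on the diagonal and 1 in every other position.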

Lemma 2.2 Let B be a v × v matrix such that B = (r - λ)I + λJ, where
I is the v × v identity matrix and J is the v × v matrix of all ones. Then

    det B = (r - λ)^{v-1} (r + (v - 1)λ).


Proof. Note first that B must have the following form.

          [ r  λ  ...  λ ]
    B  =  [ λ  r  ...  λ ]
          [ :  :       : ]
          [ λ  λ  ...  r ]

Subtracting the first column of B from each of the remaining columns of B
yields the following.

           [ r   λ - r   ...   λ - r ]
    B_1 =  [ λ   r - λ   ...   0     ]
           [ :   :             :     ]
           [ λ   0       ...   r - λ ]

Then, adding each row of B_1 except the first to the first row of B_1 yields
the following.

           [ r + (v - 1)λ   0       ...   0     ]
    B_2 =  [ λ              r - λ   ...   0     ]
           [ :              :             :     ]
           [ λ              0       ...   r - λ ]

Since B_2 is triangular, det B_2 is equal to the product of the diagonal entries
of B_2. Hence, det B_2 = (r - λ)^{v-1} (r + (v - 1)λ). But det B = det B_2.
Thus, det B = (r - λ)^{v-1} (r + (v - 1)λ). ∎
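
Lemma 2.2 can be spot-checked for particular values of v, r, and λ by computing the determinant exactly. The Python sketch below uses Gaussian elimination over exact fractions; it is a numerical check under stated assumptions rather than part of the proof, and the function names det and design_matrix are hypothetical.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        # find a row at or below the diagonal with a nonzero entry in column c
        piv = next((i for i in range(c, n) if M[i][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d                      # a row swap changes the sign
        d *= M[c][c]
        for i in range(c + 1, n):
            factor = M[i][c] / M[c][c]
            for j in range(c, n):
                M[i][j] -= factor * M[c][j]
    return d

def design_matrix(v, r, lam):
    """The v x v matrix (r - lam)I + lam*J."""
    return [[r if i == j else lam for j in range(v)] for i in range(v)]

v, r, lam = 7, 3, 1
print(det(design_matrix(v, r, lam)))
print((r - lam) ** (v - 1) * (r + (v - 1) * lam))
```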


Theorem 2.3 The parameters in a (v, b, r, k, λ) block design satisfy the
inequalities v ≤ b and k ≤ r.


Proof. Let A be an incidence matrix for the design. Since k < v, Theorem
2.1 implies λ < r. Then by Lemma 2.2, we know det AA^t ≠ 0. Since the
rank of a product is at most the minimum rank of the factors, it follows
that rank A ≥ rank AA^t = v. Hence, since A is of size v × b, we know that
v ≤ b. And then by Theorem 2.1 we know that k ≤ r. ∎


A block design is said to be symmetric if it has the same number of
objects and blocks. That is, a (v, b, r, k, λ) block design is symmetric if
b = v, which by Theorem 2.1 implies k = r. The block design in Example
2.1 is symmetric.


Theorem 2.4 In a (v, v, r, r, λ) block design, each distinct pair of blocks
contains λ objects in common.


Proof. Let A be an incidence matrix for the design. By Lemma 2.2 we
know that A must be nonsingular. Also, for the v × v matrix J of all ones,
it follows that AJ = JA since each entry in both products will be r. Now,
since AA^t = (r - λ)I + λJ for the v × v identity matrix I, and AJ = JA,
then

    AA^tA = ((r - λ)I + λJ)A = A((r - λ)I + λJ) = AAA^t.

Since A is nonsingular, it can be canceled from the left of both sides of the
equation AA^tA = AAA^t, leaving A^tA = AA^t = (r - λ)I + λJ. Thus, the
dot product of any two distinct columns of A (the off-diagonal entries of
A^tA) will be equal to λ. Hence, each distinct pair of blocks in the design
will contain λ objects in common. ∎
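
For the symmetric design in Example 2.1, the conclusion of Theorem 2.4 can be confirmed by intersecting the blocks pairwise. The Python sketch below is an illustration only; the variable names are hypothetical.

```python
from itertools import combinations

blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]

# sizes of the intersections of all distinct pairs of blocks
common = {len(B & C) for B, C in combinations(blocks, 2)}
print(common)
```

A single value in the printed set means every pair of distinct blocks shares the same number of objects.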


Theorem 2.4 states that in a symmetric block design, the number of 
objects contained in common in each pair of blocks will be equal to the 
number of blocks that contain each pair of objects. Thus, in the block 
design in Example 2.1, each pair of consumers will test the same car once. 


2.2 Hadamard Matrices 


In this section we show how Hadamard matrices can be used to construct
block designs. An $n \times n$ matrix $H$ is called a Hadamard matrix if the entries
in $H$ are all $1$ or $-1$, and $HH^t = nI$ for the $n \times n$ identity matrix $I$.


For an $n \times n$ Hadamard matrix $H$, since $\frac{1}{n}H^t = H^{-1}$, then $H^tH = nI$.
Since $HH^t = H^tH = nI$, we see that the dot product of any row or column
of $H$ with itself will be equal to $n$, and the dot product of any two distinct
rows or columns of $H$ will be equal to 0. Thus, changing the sign of each
entry in a row or column of $H$ will yield another Hadamard matrix. A
Hadamard matrix $H$ is said to be normalized if the first row and column
of $H$ contain only positive ones. Therefore, every Hadamard matrix can
be converted into a normalized Hadamard matrix by changing the signs of
the entries in the necessary rows and columns. Because the first row and
column of a normalized Hadamard matrix $H$ contain only positive ones, all
other rows and columns of $H$ must contain the same number of positive
and negative ones. Thus, for a Hadamard matrix $H$ of order $n$, if $n > 1$,
then $n$ must be even. In fact, if $n > 2$, then $n$ must be a multiple of 4,
since for $H = (h_{ij})$,

$\sum_j (h_{1j} + h_{2j})(h_{1j} + h_{3j}) = \sum_j h_{1j}^2 = n,$

and $(h_{1j} + h_{2j})(h_{1j} + h_{3j}) = 0$ or $4$ for each $j$.
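

The two facts used in the preceding display can be spelled out as follows. This is only a short worked expansion; the symbol $m$ is introduced here for illustration and does not appear in the text. Expanding the product and using the orthogonality of distinct rows of $H$ gives the value $n$, and each term is 0 unless the first three rows agree in column $j$.

```latex
\sum_j (h_{1j}+h_{2j})(h_{1j}+h_{3j})
  = \sum_j h_{1j}^2 + \sum_j h_{1j}h_{3j} + \sum_j h_{2j}h_{1j} + \sum_j h_{2j}h_{3j}
  = n + 0 + 0 + 0 ,
\qquad
(h_{1j}+h_{2j})(h_{1j}+h_{3j}) =
  \begin{cases}
    4h_{1j}^2 = 4 & \text{if } h_{1j} = h_{2j} = h_{3j},\\
    0 & \text{otherwise.}
  \end{cases}
```

Hence, if $m$ denotes the number of columns in which the first three rows agree, then $n = 4m$.
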


The only normalized Hadamard matrices of orders one and two (i.e.,
of sizes $1 \times 1$ and $2 \times 2$) are $H_1 = \begin{bmatrix} 1 \end{bmatrix}$ and
$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$. Also,
$H_4 = \begin{bmatrix} H_2 & H_2 \\ H_2 & -H_2 \end{bmatrix}$
is a normalized Hadamard matrix of order four. This
construction of $H_4$ from $H_2$ generalizes. Specifically, if $H$ is a (normalized)
Hadamard matrix, then $\begin{bmatrix} H & H \\ H & -H \end{bmatrix}$ is also a (normalized) Hadamard
matrix (see Written Exercise 6). This shows that there are Hadamard matrices
of order $2^n$ for every positive integer $n$.


We are interested in Hadamard matrices because they provide us with a
method for constructing block designs. For a normalized Hadamard matrix
$H$ of order $4t \ge 8$, if we delete the first row and column from $H$, and change
all negative ones in $H$ to zeros, the resulting matrix will be an incidence
matrix for a $(4t-1, 4t-1, 2t-1, 2t-1, t-1)$ block design. We state this
as the following theorem.


Theorem 2.5 Let $H$ be a normalized Hadamard matrix of order $4t \ge 8$.
If the first row and column of $H$ are deleted, and all negative ones in $H$
are changed to zeros, the resulting matrix will be an incidence matrix for a
$(4t-1, 4t-1, 2t-1, 2t-1, t-1)$ block design.


Proof. Delete the first row and column from $H$, change all negative ones
in $H$ to zeros, and call the resulting matrix $A$. Each row and column of $H$
except the first will contain $2t$ ones. Therefore, each row and column of $A$
will contain $2t - 1$ ones. Hence, the dot product of any row or column of $A$
with itself will be equal to $k = 2t - 1$. Furthermore, in any pair of distinct
rows of $H$ excluding the first, there will be $2t$ positions in which the rows
differ, $t$ positions in which the rows both have a $1$, and $t$ positions in which
the rows both have a $-1$. Thus, in the corresponding pair of rows of $A$,
there will be $t - 1$ positions in which the rows both have a 1, so the dot
product of any two distinct rows of $A$ will be equal to $\lambda = t - 1$. Therefore,
$AA^t = (k - \lambda)I + \lambda J$, where $I$ is the $(4t-1) \times (4t-1)$ identity matrix, and
$J$ is the $(4t-1) \times (4t-1)$ matrix of all ones. Since also $JA = kJ$, then
we know $A$ is the incidence matrix for a $(4t-1, 4t-1, 2t-1, 2t-1, t-1)$
block design. ∎


Example 2.3 Consider the normalized Hadamard matrix

$H_8 = \begin{bmatrix} H_4 & H_4 \\ H_4 & -H_4 \end{bmatrix}$

of order 8, where $H_4$ is the normalized Hadamard matrix of order four
constructed previously. Using $H = H_8$, Theorem 2.5 states that

$A = \begin{bmatrix}
0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0
\end{bmatrix}$

is the incidence matrix for a (7,7,3,3,1) block design. Note that this
incidence matrix is not the same as the incidence matrix in Example 2.2
for the (7,7,3,3,1) block design in Example 2.1. ∎


2.3 Hadamard Matrices with Maple 


In this section, we show how Maple can be used to construct the Hadamard
matrices $H_{2^n}$ and corresponding block designs discussed in Section 2.2. We
consider the design resulting from the incidence matrix in Example 2.3.


Because some of the commands we will use are in the Maple linalg 
linear algebra package, we begin by including this package. 
> with(linalg): 


Next, we define the Hadamard matrix $H_1 = \begin{bmatrix} 1 \end{bmatrix}$.
> H1 := matrix(1, 1, [1]); 


Recall that the Hadamard matrix $H_{2^k}$ can be constructed as a block matrix
from $H_{2^{k-1}}$. Hence, the Hadamard matrices $H_2$, $H_4$, and $H_8$ can be
constructed using the Maple blockmatrix command as follows.¹

> H2 := blockmatrix(2, 2, [H1, H1, H1, -H1]);


¹ Maple V Release 5 is the first release of Maple that requires brackets [ ] to be
included in the blockmatrix command around the matrices that form the blocks. For
example, to construct the matrix $H_2$ with an earlier release of Maple, the blockmatrix
command must be entered as follows.
> H2 := blockmatrix(2, 2, H1, H1, H1, -H1);


> H4 := blockmatrix(2, 2, [H2, H2, H2, -H2]);


             [1   1   1   1]
             [1  -1   1  -1]
      H4 :=  [1   1  -1  -1]
             [1  -1  -1   1]


> H8 := blockmatrix(2, 2, [H4, H4, H4, -H4]); 


             [1   1   1   1   1   1   1   1]
             [1  -1   1  -1   1  -1   1  -1]
             [1   1  -1  -1   1   1  -1  -1]
             [1  -1  -1   1   1  -1  -1   1]
      H8 :=  [1   1   1   1  -1  -1  -1  -1]
             [1  -1   1  -1  -1   1  -1   1]
             [1   1  -1  -1  -1  -1   1   1]
             [1  -1  -1   1  -1   1   1  -1]


The first two parameters in the preceding blockmatrix commands are the 
dimensions of the result in terms of blocks. The remaining parameters are 
an ordered list of the blocks by rows. Normalized Hadamard matrices of 
higher orders can be constructed similarly. 
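
The repeated blockmatrix step can also be wrapped in a short procedure. The following is only a sketch and is not part of the original session: the procedure name sylvester and the variable H16 are hypothetical, and the code assumes the Release 5 bracket syntax for blockmatrix. The last command checks the defining property $HH^t = nI$ for $n = 16$.

```maple
> with(linalg):
> # Hypothetical helper: returns the Hadamard matrix of order 2^n
> # obtained by applying the block construction n times to [1].
> sylvester := proc(n)
>    local H, i;
>    H := matrix(1, 1, [1]);
>    for i from 1 to n do
>       H := blockmatrix(2, 2, [H, H, H, evalm(-H)]);
>    od;
>    evalm(H);
> end:
> H16 := sylvester(4):
> # Compare H16 times its transpose with 16 times the identity matrix.
> equal(multiply(H16, transpose(H16)),
>       matrix(16, 16, (i,j) -> if i = j then 16 else 0 fi));
```

If the construction behaves as described in Section 2.2, the final comparison should return true.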


We will now construct the incidence matrix shown in Example 2.3 that
results from the Hadamard matrix $H_8$. We first delete the first row and
column from $H_8$ by applying the Maple delrows and delcols commands
as follows.

> A := delrows(H8, 1..1): 

> A := delcols(A, 1..1): 


We can then obtain the incidence matrix by changing all negative ones in 
A to zeros. To do this, we define the following function f. 
> f := x -> if x = -1 then 0 else 1 fi: 


We then apply the function f to each of the entries in A by entering the 
following map command. 
> A := map(f, A); 


            [0  1  0  1  0  1  0]
            [1  0  0  1  1  0  0]
            [0  0  1  1  0  0  1]
      A :=  [1  1  1  0  0  0  0]
            [0  1  0  0  1  0  1]
            [1  0  0  0  0  1  1]
            [0  0  1  0  1  1  0]


Note that the preceding matrix is the incidence matrix from Example 2.3.
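
As a check on Theorem 2.5 and Theorem 2.4, the relations $AA^t = A^tA = (k - \lambda)I + \lambda J$ can be tested directly for this matrix, where $k = 3$ and $\lambda = 1$. The commands below are a sketch rather than part of the original session: the matrix is re-entered by hand so that the commands stand alone, and the name T for the target matrix $2I + J$ is hypothetical.

```maple
> with(linalg):
> A := matrix(7, 7, [0,1,0,1,0,1,0,
>                    1,0,0,1,1,0,0,
>                    0,0,1,1,0,0,1,
>                    1,1,1,0,0,0,0,
>                    0,1,0,0,1,0,1,
>                    1,0,0,0,0,1,1,
>                    0,0,1,0,1,1,0]):
> # T = (k - lambda)*I + lambda*J has 3 on the diagonal and 1 elsewhere.
> T := matrix(7, 7, (i,j) -> if i = j then 3 else 1 fi):
> equal(multiply(A, transpose(A)), T);
> equal(multiply(transpose(A), A), T);
```

Both comparisons should return true. The first says that each pair of objects lies in exactly one common block, and the second that each pair of blocks shares exactly one object.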


Finally, we will use Maple to list the objects that are contained in each 
of the blocks in the design. To do this, we first assign the general block 
design parameters as follows. 


> v := 7:
> b := 7:
> k := 3:


Since each of the blocks in the design will contain k objects, we create the 
following vector block of length k in which to store the objects contained 
in each block. 

> block := vector(k); 


block := array(1..3,[ ]) 


By entering the following commands, we can then see the objects that are 
contained in each block. In these commands, the outer loop spans the 
columns of A, and the inner loop spans the rows of A. 


> for j from 1 to b do
>    bct := 0:
>    for i from 1 to v do
>       if A[i,j] = 1 then
>          bct := bct + 1;
>          block[bct] := i;
>       fi;
>    od:
>    print(`Block `, j, ` contains objects `, block);
> od:

      Block , 1,  contains objects , [2, 4, 6]
      Block , 2,  contains objects , [1, 4, 5]
      Block , 3,  contains objects , [3, 4, 7]
      Block , 4,  contains objects , [1, 2, 3]
      Block , 5,  contains objects , [2, 5, 7]
      Block , 6,  contains objects , [1, 6, 7]
      Block , 7,  contains objects , [3, 5, 6]


In the preceding commands, note that we use colons after both od statements.
This causes Maple to suppress the output (except the output resulting
from the print command) after each passage through the loop. Note
also that, as in Section 1.5, we use back ticks (`) in the print statement.


2.4 Difference Sets


In this section we discuss some techniques for constructing block designs 
using difference sets. As we will show, these techniques yield block de- 
signs with more of a variety of parameters than the designs resulting from 
Hadamard matrices. 


Let $G$ be an abelian group of order $v$, and let $D$ be a subset of $G$ of
order $k$. If every nonzero element in $G$ can be expressed as the difference
of two elements in $D$ in exactly $\lambda$ ways with $\lambda < k$, then $D$ is called a
difference set in $G$, and is described by the parameters $(v, k, \lambda)$.


Example 2.4 The set $D = \{0, 1, 2, 4, 5, 8, 10\}$ is a (15,7,3) difference set
in $\mathbb{Z}_{15}$. ∎


Example 2.5 The set $D = \{1, 2, 4\}$ is a (7,3,1) difference set in $\mathbb{Z}_7$. If
we add each element in $\mathbb{Z}_7$ to each of the elements in $D$ (i.e., if we form
the sets $i + D$ for $i = 0, 1, \ldots, 6$), it can easily be verified that the seven
resulting sets are the blocks in the block design in Example 2.1 (with $0 \in \mathbb{Z}_7$
represented by 7). Hence, the (7,3,1) difference set $D = \{1, 2, 4\}$ in $\mathbb{Z}_7$ can
be used to construct the (7,7,3,3,1) block design in Example 2.1. ∎
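
The difference set condition in Examples 2.4 and 2.5 can be checked by brute force in Maple. The following is a sketch under stated assumptions: the procedure name diffcount and its local names are hypothetical, the group is taken to be $\mathbb{Z}_v$ with elements represented by integers, and the name D is avoided because Maple reserves it for the differential operator. The procedure returns, for each nonzero $g$ in $\mathbb{Z}_v$, the number of ordered pairs of elements in the set whose difference is $g$.

```maple
> # Hypothetical helper: counts how often each nonzero element of Z_v
> # occurs as a difference a - b of two elements of the list dset.
> diffcount := proc(dset, v)
>    local cnt, a, b, g, u, m;
>    for u from 1 to v-1 do cnt[u] := 0 od;
>    for a in dset do
>       for b in dset do
>          g := (a - b) mod v;
>          if g <> 0 then cnt[g] := cnt[g] + 1 fi;
>       od;
>    od;
>    [seq(cnt[m], m = 1..v-1)];
> end:
> diffcount([1, 2, 4], 7);
> diffcount([0, 1, 2, 4, 5, 8, 10], 15);
```

For a $(v, k, \lambda)$ difference set every entry of the returned list equals $\lambda$, so the first list should consist of six ones and the second of fourteen threes.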


The fact that a block design results from adding each element in Z7 to 
each of the elements in the difference set D in Example 2.5 is guaranteed 
in general by the following theorem. 


Theorem 2.6 Let $D = \{d_1, \ldots, d_k\}$ be a $(v, k, \lambda)$ difference set in the
abelian group $G = \{g_1, \ldots, g_v\}$. Then the sets

$g_i + D = \{g_i + d_1, \ldots, g_i + d_k\}, \quad i = 1, \ldots, v$

are the blocks in a $(v, v, k, k, \lambda)$ block design.


Proof. Clearly, there are $v$ objects in the design. Also, the $v$ blocks $g_i + D$
are distinct, for if $g_i + D = g_j + D$ for some $i \ne j$, then $(g_i - g_j) + D = D$.
We can then find $k$ differences of elements in $D$ that are equal to $g_i - g_j$,
contradicting the assumption that $\lambda < k$. Now, if we add an element in $D$
to each of the elements in $G$, the result will be the set $G$. Each element
in $G$ will appear $k$ times among the elements $g_i + d_j$ for $i = 1, \ldots, v$ and
$j = 1, \ldots, k$. Hence, each element in $G$ will appear in $k$ blocks. Also, by
construction, each block will contain $k$ objects. It remains to be shown
only that each pair of elements in $G$ appears together in exactly $\lambda$ blocks.
Choose distinct $x, y \in G$. If $x, y$ appear together in some block $g + D$, then
$x = g + d_i$ and $y = g + d_j$ for some $i, j$. Thus, $x - y = d_i - d_j$, so $x - y$ is the
difference of two elements in $D$. Since $D$ is a $(v, k, \lambda)$ difference set in $G$,
$x - y$ can be written as the difference of two elements in $D$ in $\lambda$ ways. And
since $x = g + d_i = h + d_i$ implies $g = h$, the difference $d_i - d_j$ cannot come
from more than one block. Hence, the pair $x, y$ cannot appear in more than
$\lambda$ blocks. On the other hand, suppose $x - y = d_i - d_j$ for some $i, j$. Then
$x = g + d_i$ for some $g \in G$, and $y = x - (d_i - d_j) = (x - d_i) + d_j = g + d_j$.
Thus, $x$ and $y$ appear together in the block $g + D$. Therefore, the pair $x, y$
must appear in at least $\lambda$ blocks. With our previous result, this implies $x$
and $y$ must appear in exactly $\lambda$ blocks. ∎


As illustrated in Example 2.5, Theorem 2.6 gives us an easy method for
constructing a $(v, v, k, k, \lambda)$ block design provided we are first able to find
a $(v, k, \lambda)$ difference set. Before discussing how we can construct difference
sets, we first generalize them as follows.


Let $G$ be an abelian group of order $v$, and let $D_1, \ldots, D_t$ be subsets
of $G$ of order $k$. If every nonzero element in $G$ can be expressed as the
difference of two elements in the same $D_i$ in exactly $\lambda$ ways with $\lambda < k$,
then the subsets $D_i$ are called initial blocks in a generalized difference set
in $G$, and are described by the parameters $(v, k, \lambda)$. Note that for $t = 1$,
this definition matches our previous definition of a difference set.


The following theorem generalizes the method given in Theorem 2.6 
for constructing block designs from difference sets. 


Theorem 2.7 Let $D_1, \ldots, D_t$ be initial blocks in a $(v, k, \lambda)$ generalized
difference set in the abelian group $G = \{g_1, \ldots, g_v\}$. Then the sets

$g_i + D_j, \quad i = 1, \ldots, v, \quad j = 1, \ldots, t$

are the blocks in a $(v, vt, kt, k, \lambda)$ block design.


Proof. Exercise. ∎


Example 2.6 The sets $D_1 = \{1, 7, 11\}$, $D_2 = \{2, 14, 3\}$, and $D_3 = \{4, 9, 6\}$
are initial blocks in a (19,3,1) generalized difference set in $\mathbb{Z}_{19}$. Theorem
2.7 states that if we add each element in $\mathbb{Z}_{19}$ to each of the elements in
these initial blocks, the resulting sets will be the blocks in a (19,57,9,3,1)
block design. ∎


As illustrated in Example 2.6, Theorem 2.7 gives us an easy method
for constructing a $(v, vt, kt, k, \lambda)$ block design provided we are first able to
find $t$ initial blocks in a $(v, k, \lambda)$ generalized difference set. The following
two propositions give methods for constructing initial blocks in generalized
difference sets.


Proposition 2.8 Suppose $v = 6t + 1 = p^n$ for some prime $p$ and positive
integers $n$ and $t$. Let $F$ be a finite field of order $p^n$, and choose $a \in F$ such
that $a$ is a cyclic generator for $F^*$. Then the sets

$D_i = \{a^i, a^{2t+i}, a^{4t+i}\}, \quad i = 0, \ldots, t-1$

are initial blocks in a $(6t + 1, 3, 1)$ generalized difference set in $F$.

Proof. Exercise. (See the proof of Proposition 2.9 below.) ∎


Example 2.7 We can use Proposition 2.8 to construct the initial blocks
in Example 2.6 as follows. Let $F = \mathbb{Z}_{19}$, and choose cyclic generator $a = 2$
for $\mathbb{Z}_{19}^*$. Since $19 = 6t + 1$ implies $t = 3$, Proposition 2.8 yields three
initial blocks. These initial blocks are $D_0 = \{2^0, 2^6, 2^{12}\} = \{1, 7, 11\}$,
$D_1 = \{2^1, 2^7, 2^{13}\} = \{2, 14, 3\}$, and $D_2 = \{2^2, 2^8, 2^{14}\} = \{4, 9, 6\}$. ∎
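
The computation in Example 2.7 is easy to reproduce in Maple. The commands below are a minimal sketch, assuming the field is $\mathbb{Z}_p$ for a prime $p = 6t + 1$ so that ordinary integer arithmetic modulo $p$ suffices; the variable names are chosen here for illustration.

```maple
> p := 19: a := 2: t := (p - 1)/6:
> # Initial block D_i = {a^i, a^(2t+i), a^(4t+i)} of Proposition 2.8.
> for i from 0 to t-1 do
>    print([seq(a^(2*t*j + i) mod p, j = 0..2)]);
> od:
```

With these values the loop prints the lists [1, 7, 11], [2, 14, 3], and [4, 9, 6], which are the initial blocks in Example 2.6. For a field of prime power order that is not prime, the Powmod command used in Section 2.5 would be needed instead.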


Proposition 2.9 Suppose $v = 4t + 1 = p^n$ for some prime $p$ and positive
integers $n$ and $t$. Let $F$ be a finite field of order $p^n$, and choose $a \in F$ such
that $a$ is a cyclic generator for $F^*$. Then the sets

$D_i = \{a^i, a^{t+i}, a^{2t+i}, a^{3t+i}\}, \quad i = 0, \ldots, t-1$

are initial blocks in a $(4t + 1, 4, 3)$ generalized difference set in $F$.


Proof. Since $a$ is a cyclic generator for $F^*$, the order of $a$ is $4t$. Hence,
$a^{4t} = 1$, and $a^{2t} \ne 1$. Also, $a^{4t} - 1 = (a^{2t} - 1)(a^{2t} + 1) = 0$ implies
$a^{2t} = -1$. Furthermore, $a^t - 1 \ne 0$, so $a^t - 1 = a^s$ for some $s$ between 1 and
$4t$. Forming all possible differences from the sets $\pm a^i(a^t - 1)$, $\pm a^i(a^{2t} - 1)$,
$\pm a^i(a^{2t} - a^t)$, $\pm a^i(a^{3t} - 1)$, $\pm a^i(a^{3t} - a^t)$, and $\pm a^i(a^{3t} - a^{2t})$, we obtain
the following.

$\pm a^i(a^t - 1) = \pm a^i(a^s) = a^{i+s},\ a^{i+2t+s}$

$\pm a^i(a^{2t} - 1) = \pm a^i(2a^{2t}) = 2a^{i+2t},\ 2a^{i}$

$\pm a^i(a^{2t} - a^t) = \pm a^i a^t(a^t - 1) = a^{i+t+s},\ a^{i+3t+s}$

$\pm a^i(a^{3t} - 1) = \pm a^i a^{3t}(1 - a^t) = a^{i+t+s},\ a^{i+3t+s}$

$\pm a^i(a^{3t} - a^t) = \pm a^i a^t(2a^{2t}) = 2a^{i+3t},\ 2a^{i+t}$

$\pm a^i(a^{3t} - a^{2t}) = \pm a^i a^{2t}(a^s) = a^{i+2t+s},\ a^{i+s}$


Multiplication by $a^s$ and 2 are bijections, so these elements can be canceled
from the preceding expressions. The only remaining elements are $a^i$, $a^{t+i}$,
$a^{2t+i}$, and $a^{3t+i}$ for $i = 0, \ldots, t-1$ repeated three times each. Since these
are all of the elements in $F^*$, then $\lambda = 3$. ∎


Example 2.8 Suppose we wish to obtain a fair and reasonable comparison
of 9 cars by evaluating the opinions of 18 consumers. We can use Proposition
2.9 to construct a block design for this comparison as follows. We first
need a finite field $F$ of order 9 to represent the cars. For $F$, we will use
the field of order 9 constructed in Example 1.10. For the cyclic generator
$a$ for $F^*$, we will use the element $x \in F$. Since $9 = 4t + 1$ implies $t = 2$,
Proposition 2.9 yields two initial blocks in a (9,4,3) generalized difference
set in $F$. These initial blocks are $D_0 = \{1, x^2, x^4, x^6\} = \{1, 2x+1, 2, x+2\}$
and $D_1 = \{x, x^3, x^5, x^7\} = \{x, 2x+2, 2x, x+1\}$. Theorem 2.7 states that
if we add each element in $F$ to each of the elements in these initial blocks,
the resulting sets will be the blocks in a (9,18,8,4,3) block design. The
blocks in this design are listed at the end of Section 2.5. Note that in this
block design, each car is tested 8 times, each consumer tests 4 cars, and
each pair of cars is tested by the same consumer 3 times. ∎


2.5 Difference Sets with Maple 


In this section, we show how Maple can be used to construct the initial 
blocks and corresponding block designs discussed in Section 2.4. We con- 
sider the design resulting from the initial blocks in Example 2.8. 


We begin by including the Maple linalg package and entering the
primitive polynomial $f(x) = x^2 + x + 2 \in \mathbb{Z}_3[x]$ used to construct the
elements in the finite field $F$.


> with(linalg):
> f := x -> x^2 + x + 2:
> Primitive(f(x)) mod 3;


true 


Recall that since v = 4t + 1 = 9 implies t = 2, there will be 2 initial blocks. 
We define this parameter next. 


> t := 2:


Because the field elements are the objects that will fill the blocks, we must
store these elements in a way such that they can be recalled. We will do
this by storing the field elements in a vector. We first initialize a vector
with the same number of positions as the number of field elements.


> field := vector(4*t+1); 


field := array(1..9,[]) 


© 1999 by CRC Press LLC 


Then by entering the following commands, we generate and store the field 
elements in the vector field. Note the bracket [ ] syntax for accessing 
the positions in field. 

> for i from 1 to 4*t do
>    field[i] := Powmod(x, i, f(x), x) mod 3:
> od:
> field[4*t+1] := 0:


We can view the entries in the vector field by entering the following evalm
command.

> evalm(field);


      [x, 2x + 1, 2x + 2, 2, 2x, x + 2, x + 1, 1, 0]


Next, we define the number k = 4 of objects contained in each of the initial 
blocks and blocks in the design, and create a vector in which to store the 
initial blocks. 

> k := 4:


> initblock := vector(k); 


initblock := array(1..4,[ ] ) 


We can then generate and display the initial blocks by entering the following 
commands. In these commands, the outer loop spans the initial blocks while 
the inner loop constructs the entries in each one. 


> for i from 0 to t-1 do
>    for j from 1 to k do
>       initblock[j] := Powmod(x, (j-1)*t+i, f(x), x) mod 3;
>    od:
>    print(`Initial Block `, i, ` is `, initblock);
> od:

      Initial Block , 0,  is , [1, 2x + 1, 2, x + 2]
      Initial Block , 1,  is , [x, 2x + 2, 2x, x + 1]


In order to construct all of the blocks in the design, we first create a vector 
in which to store the blocks, and initialize a counter bct we will use to 
number the blocks. 

> block := vector(k):
> bct := 0:


We can then generate and display all of the blocks in the design by entering
the following commands. In these commands, the outer loop spans the
initial blocks while the first inner loop constructs the entries in each one.
The last two inner loops add each field element to each of the elements in
the initial blocks, thus yielding the blocks in the design.


> for i from 0 to t-1 do
>    for j from 1 to k do
>       initblock[j] := Powmod(x, (j-1)*t+i, f(x), x) mod 3;
>    od:
>    for j from 1 to 4*t+1 do
>       for h from 1 to k do
>          block[h] := (field[j] + initblock[h]) mod 3;
>       od:
>       bct := bct + 1;
>       print(`Block `, bct, ` is `, block);
>    od:
> od:

      Block , 1,  is , [x + 1, 1, x + 2, 2x + 2]
      Block , 2,  is , [2x + 2, x + 2, 2x, 0]
      Block , 3,  is , [2x, x, 2x + 1, 1]
      Block , 4,  is , [0, 2x, 1, x + 1]
      Block , 5,  is , [2x + 1, x + 1, 2x + 2, 2]
      Block , 6,  is , [x, 0, x + 1, 2x + 1]
      Block , 7,  is , [x + 2, 2, x, 2x]
      Block , 8,  is , [2, 2x + 2, 0, x]
      Block , 9,  is , [1, 2x + 1, 2, x + 2]
      Block , 10,  is , [2x, 2, 0, 2x + 1]
      Block , 11,  is , [1, x, x + 1, 2]
      Block , 12,  is , [2, x + 1, x + 2, 0]
      Block , 13,  is , [x + 2, 2x + 1, 2x + 2, x]
      Block , 14,  is , [0, x + 2, x, 1]
      Block , 15,  is , [2x + 2, 1, 2, 2x]
      Block , 16,  is , [2x + 1, 0, 1, 2x + 2]
      Block , 17,  is , [x + 1, 2x, 2x + 1, x + 2]
      Block , 18,  is , [x, 2x + 2, 2x, x + 1]


Written Exercises


1. Suppose a magazine editor wishes to obtain a comparison of 15 cars 
by evaluating the opinions of 15 consumers. Construct a block design 
for this comparison. List the block design parameters. 


2. Construct two different block designs with v = 13 objects. List the 
block design parameters for each one. 


3. Suppose a magazine editor wishes to obtain a comparison of 25 cars 
by evaluating the opinions of a certain number of consumers after 
each of the consumers tests 3 of the cars. Construct a block design 
for this comparison. List the block design parameters, and state what 
each parameter represents. 


4. Repeat Written Exercise 3 if the editor decides to have each of the 
consumers test 4 of the cars instead of 3. 


5. Repeat Written Exercise 3 if the editor decides to compare only 7 cars 
instead of 25. (Assume each consumer still tests 3 of the cars.) 


6. Show that if $H$ is a Hadamard matrix, then so is $\begin{bmatrix} H & H \\ H & -H \end{bmatrix}$.


7. Prove Theorem 2.7. 


8. Prove Proposition 2.8. 


Maple Exercises 


1. Suppose a magazine editor wishes to obtain a comparison of 31 cars 
by evaluating the opinions of 31 consumers. Construct a block design 
for this comparison. List the block design parameters. 


2. Suppose a magazine editor wishes to obtain a comparison of 81 types 
of candy by evaluating the opinions of a certain number of children 
after each child samples 4 of the types of candy. Construct the initial 
blocks in a block design for this comparison. List the block design 
parameters, and state what each parameter represents. 


3. Construct two different block designs with v = 121 objects. (Con- 
struct initial blocks only if you use Propositions 2.8 or 2.9.) List the 
block design parameters for each one. 


4. Construct two different block designs with v = 127 objects. (Con- 
struct initial blocks only if you use Propositions 2.8 or 2.9.) List the 
block design parameters for each one. 


Chapter 3


Error-Correcting Codes 


In the next three chapters we discuss several types of error-correcting codes. 
A code is a set of messages called codewords that can be transmitted be- 
tween two parties. An “error-correcting” code is a code for which it is 
sometimes possible to detect and correct errors that occur during transmis- 
sion of the codewords. Some applications of error-correcting codes include 
correction of errors that occur in information transmitted via the Internet, 
data stored in a computer, and music encoded on a compact disc. Error- 
correcting codes can also be used to correct errors that occur in information 
transmitted through space. For example, we mention in Section 3.3 how 
an error-correcting code was used in the Mariner 9 space probe when it 
returned photographs of Mars to Earth in 1972. 


3.1 General Properties of Codes 


In this chapter we consider some types of codes in which the codewords
are vectors of a fixed length over $\mathbb{Z}_2$. We will denote the space of vectors
of length $n$ over $\mathbb{Z}_2$ as $\mathbb{Z}_2^n$. Hence, the codes we consider in this chapter
will be subsets of $\mathbb{Z}_2^n$ for some $n$. A code $C$ in $\mathbb{Z}_2^n$ is not required to be a
subspace of $\mathbb{Z}_2^n$. If $C$ is a subspace of $\mathbb{Z}_2^n$, then we call $C$ a linear code. We
discuss linear codes beginning in Section 3.5 and continuing in Chapters 4
and 5.


The way we will tell in general if an error occurred during the
transmission of a codeword in a code $C$ is by determining if the received vector
is in $C$. Thus, because our goal is to be able to detect and correct errors
in received vectors, not all vectors in $\mathbb{Z}_2^n$ for some $n$ can be codewords in
a code $C$. In general, we will use the “nearest neighbor policy” to correct
a received vector that contains errors. This means that we will assume
the fewest possible number of errors, and correct the received vector to the
codeword from which it differs in the fewest positions. This method of error
correction is limited, for there is not always a unique codeword that differs
from a received vector in the fewest positions.


Example 3.1 Consider the code $C = \{(1010), (1110), (0011)\}$ in $\mathbb{Z}_2^4$.
Suppose a codeword in $C$ is transmitted and we receive the vector $r_1 = (0110)$.
A quick search of $C$ reveals that $c = (1110)$ is the codeword from which $r_1$
differs in the fewest positions. Hence, we would correct $r_1$ to $c$, and assume
that the error in $r_1$ is $e = r_1 - c = (1000)$. Now, suppose a codeword
in $C$ is transmitted and we receive the vector $r_2 = (0010)$. Since two of
the codewords in $C$ differ from $r_2$ in only one position, we cannot uniquely
correct $r_2$ using the nearest neighbor policy. Therefore, in this code $C$, we
are not guaranteed to be able to uniquely correct a received vector in $\mathbb{Z}_2^4$
even if the received vector contains only a single error. ∎


To make the nearest neighbor policy error correction method more
precise, we make the following definition. Let $C$ be a code in $\mathbb{Z}_2^n$. For
vectors $x, y \in \mathbb{Z}_2^n$, we define the Hamming distance $d(x, y)$ from $x$ to $y$ to be
the number of positions in which $x$ and $y$ differ. Hence, if $x = (x_1, \ldots, x_n)$
and $y = (y_1, \ldots, y_n)$, then $d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$. We will call the smallest
Hamming distance between any two codewords in a code $C$ the minimum
distance of $C$. We will denote this minimum distance by $d(C)$, or just $d$
if there is no confusion regarding the code to which we are referring. For
example, for the code $C$ in Example 3.1, $d = 1$.
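
The distances quoted in Example 3.1 can be computed directly from this definition. The following Maple commands are a sketch only: vectors are represented as lists of zeros and ones, and the names hdist, cw, r1, and r2 are chosen here for illustration and do not appear in the text.

```maple
> # Hypothetical helper: Hamming distance between two binary lists
> # of the same length.
> hdist := proc(x, y)
>    local i;
>    add(abs(x[i] - y[i]), i = 1..nops(x));
> end:
> cw := [[1,0,1,0], [1,1,1,0], [0,0,1,1]]:
> r1 := [0,1,1,0]:
> r2 := [0,0,1,0]:
> # Distances from each received vector to the three codewords.
> seq(hdist(r1, cw[m]), m = 1..nops(cw));
> seq(hdist(r2, cw[m]), m = 1..nops(cw));
> # Minimum distance of the code, taken over all pairs of codewords.
> min(seq(seq(hdist(cw[m], cw[n]), n = m+1..nops(cw)), m = 1..nops(cw)));
```

The first sequence is 2, 1, 2, so $r_1$ has a unique nearest codeword, while the second is 1, 2, 1, which shows the tie for $r_2$. The last command returns the minimum distance 1.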


Determining the number of errors that are guaranteed to be uniquely
correctable in a given code is an important part of coding theory. To do
this in general, consider the following. For $x \in \mathbb{Z}_2^n$ and positive integer $r$,
let $S_r(x) = \{y \in \mathbb{Z}_2^n \mid d(x, y) \le r\}$. In standard terminology, $S_r(x)$ is called
the ball of radius $r$ around $x$. Let $C$ be a code with minimum distance $d$,
and let $t$ be the largest integer such that $t < \frac{d}{2}$. Then $S_t(x) \cap S_t(y)$ is empty
for every pair $x, y$ of distinct codewords in $C$. If $z$ is a received vector in
$\mathbb{Z}_2^n$ with $d(u, z) \le t$ for some $u \in C$, then $z \in S_t(u)$ and $z \notin S_t(v)$ for all
other $v \in C$. That is, if a received vector $z \in \mathbb{Z}_2^n$ differs from a codeword
$u \in C$ in $t$ or fewer positions, then every other codeword in $C$ will differ
from $z$ in more than $t$ positions. Thus, the nearest neighbor policy will
always allow $t$ or fewer errors to be corrected in the code. The code $C$ is
said to be $t$-error correcting.


Example 3.2 Let $C = \{(00000000), (11100011), (00011111), (11111100)\}$.
It can easily be seen that the minimum distance of $C$ is $d = 5$. Since $t = 2$
is the largest integer such that $t < \frac{d}{2}$, then $C$ is 2-error correcting. ∎


Suppose $C$ is a $t$-error correcting code in $V = \mathbb{Z}_2^n$. We now address the
problem of determining the number of vectors in $V$ that are guaranteed to
be correctable in $C$. Note first that for any $x \in V$, there are $\binom{n}{i}$ vectors
in $V$ that differ from $x$ in exactly $i$ positions. Also, any vector in $V$ that
differs from $x$ in $i$ positions will be in $S_t(x)$ provided $i \le t$. Hence, the
number of vectors in $S_t(x)$ will be $\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t}$. To determine
the number of vectors in $V$ that are guaranteed to be correctable in $C$,
we must only count the number of vectors in $S_t(x)$ as $x$ ranges through
the codewords in $C$. Since the sets $S_t(x)$ are pairwise disjoint, the number
of vectors in $V$ that differ from one of the codewords in $C$ in $t$ or fewer
positions, and are consequently guaranteed to be uniquely correctable in
$C$, is $|C| \cdot \left( \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t} \right)$. The fact that $|V| = 2^n$ then yields
the following theorem, which gives a bound on the number of vectors in $\mathbb{Z}_2^n$
that are guaranteed to be correctable in a $t$-error correcting code in $\mathbb{Z}_2^n$.
This bound is called the Hamming bound.


Theorem 3.1 Suppose $C$ is a $t$-error correcting code in $\mathbb{Z}_2^n$. Then

$|C| \cdot \left( \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t} \right) \le 2^n.$


A code $C$ in $\mathbb{Z}_2^n$ is said to be perfect if every vector in $\mathbb{Z}_2^n$ is guaranteed
to be correctable in $C$. That is, a code $C$ in $\mathbb{Z}_2^n$ is perfect if the inequality
in Theorem 3.1 with $C$ is an equality. For the code $C$ in Example 3.2,
the factors in this inequality are $|C| = 4$, $\binom{8}{0} + \binom{8}{1} + \binom{8}{2} = 37$, and
$2^8 = 256$. Thus, for the code $C$ in Example 3.2, 108 of the vectors in $\mathbb{Z}_2^8$
are not guaranteed to be uniquely correctable in $C$ (some may, however,
still be “closest” to a unique codeword). Therefore, this code is far from
perfect. In Sections 3.5 and 3.6 we discuss a class of codes called Hamming
codes that are perfect.
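
The arithmetic behind these numbers is a one-line computation in Maple. The commands below are a small sketch for the code in Example 3.2; the names ncw and ball are hypothetical and stand for $|C|$ and the number of vectors in a ball of radius $t$.

```maple
> n := 8: t := 2: ncw := 4:
> # Number of vectors in a ball of radius t in Z_2^n.
> ball := add(binomial(n, i), i = 0..t);
> # Vectors guaranteed correctable, and vectors that are not.
> ncw * ball;
> 2^n - ncw * ball;
```

These commands return 37, 148, and 108, respectively. A code is perfect exactly when the last value is 0.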


In practice, it is often desirable to construct codes that have a large
number of codewords and are guaranteed to correct a large number of errors.
However, the number of errors guaranteed to be correctable in a code
is obviously related to the number of codewords in the code. Indeed, to
construct a $t$-error correcting code $C$ in $\mathbb{Z}_2^n$ for fixed values of $n$ and $t$ such
that $|C|$ is maximized has been an important problem of recent mathematical
interest. An equivalent problem is to find the maximum number of
points in $\mathbb{Z}_2^n$ such that the balls of a fixed radius around the points can be
arranged in the space without intersecting. This type of problem is called
a sphere packing problem.


For the remainder of this chapter and the two subsequent chapters we 
will discuss several methods for constructing various types of codes and 
correcting errors in these codes. To facilitate this, we will establish a set 
of parameters to use in describing codes. We will describe a code by the 
parameters (n, d) if the codewords in the code are of length n positions and 
the code has minimum distance d. 


3.2 Hadamard Codes 


In Section 2.2 we showed that for certain values of $v$, $k$, and $\lambda$, it is
possible to use a Hadamard matrix to construct an incidence matrix for a
$(v, v, k, k, \lambda)$ block design. The following theorem states that the rows of
such an incidence matrix form the codewords in an error-correcting code.


Theorem 3.2 Suppose $A$ is an incidence matrix for a $(v, v, k, k, \lambda)$ block
design. Then the rows of $A$ form a $(v, 2(k - \lambda))$ code with $v$ codewords.


Proof. There are $v$ positions in each of the $v$ rows of $A$. Hence, the
rows of $A$ form a code with $v$ codewords each of length $v$ positions. It
remains to be shown only that the minimum distance of this code is $2(k - \lambda)$.
Consider rows $R_1$ and $R_2$ in $A$. Since each row of $A$ contains ones in $k$
positions, and each pair of rows of $A$ contains ones in $\lambda$ positions in common,
there will be $k - \lambda$ positions in which $R_1$ contains a one and $R_2$ contains a
zero, and $k - \lambda$ positions in which these elements are reversed. This yields
$2(k - \lambda)$ positions in which $R_1$ and $R_2$ differ. ∎


Example 3.3 Theorem 3.2 states that the rows of the incidence matrix $A$
in Example 2.3 form a (7,4) code with 7 codewords. ∎


In Theorem 2.5, we showed that a normalized Hadamard matrix $H$
of order $4m \ge 8$ can be used to construct an incidence matrix for a
$(4m-1, 4m-1, 2m-1, 2m-1, m-1)$ block design. Theorem 3.2 states that
the rows of such an incidence matrix $A$ form codewords of length $4m - 1$
positions in a code with minimum distance $d = 2((2m - 1) - (m - 1)) = 2m$
and $4m - 1$ codewords. Recall that each of the rows of $A$ will contain $2m$
zeros and $2m - 1$ ones. Hence, there will be $2m$ positions in which the
vector $(1\ 1\ \cdots\ 1)$ of length $4m - 1$ positions differs from each of the rows of
$A$. Thus, by including the vector $(1\ 1\ \cdots\ 1)$ of length $4m - 1$ positions with
the rows of $A$, we obtain a $(4m - 1, 2m)$ code with $4m$ codewords. And no
more vectors can be included in this code without decreasing the minimum
distance of the code (see Corollary 3.4). Because these $(4m - 1, 2m)$ codes
with $4m$ codewords are constructed from Hadamard matrices, we will call
them Hadamard codes.
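
For $m = 2$ this construction gives a (7,4) Hadamard code with 8 codewords, which can be checked by listing all pairwise distances. The following Maple commands are a sketch under stated assumptions: the codewords are the seven rows of the incidence matrix from Example 2.3 together with the all-ones vector, entered by hand as lists, and the names hdist and hc are hypothetical.

```maple
> # Hypothetical helper: Hamming distance between two binary lists.
> hdist := proc(x, y)
>    local i;
>    add(abs(x[i] - y[i]), i = 1..nops(x));
> end:
> # Rows of the incidence matrix in Example 2.3, then the all-ones vector.
> hc := [[0,1,0,1,0,1,0], [1,0,0,1,1,0,0], [0,0,1,1,0,0,1],
>        [1,1,1,0,0,0,0], [0,1,0,0,1,0,1], [1,0,0,0,0,1,1],
>        [0,0,1,0,1,1,0], [1,1,1,1,1,1,1]]:
> # Set of all distances between distinct codewords.
> {seq(seq(hdist(hc[m], hc[n]), n = m+1..nops(hc)), m = 1..nops(hc))};
```

The result is the set {4}: every pair of distinct codewords differs in exactly $2m = 4$ positions, so adding the all-ones vector does not lower the minimum distance.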


We close this section by proving the following theorem and corollary 
which verify the fact mentioned above that no vectors can be joined to the 
codewords in a Hadamard code without decreasing the minimum distance 
of the code. 


Theorem 3.3 Let $r$ be the number of codewords in a code with parameters
$(n, d)$ for some $n, d$ with $d > \frac{n}{2}$. Then $r \le \frac{2d}{2d - n}$.


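Proof. Let $A = (a_{ij})$ be an $r \times n$ matrix with the codewords as rows, and
let $S = \sum_{u,v} d(u, v)$ for all distinct pairs $u, v$ of codewords. Now, $d(u, v) \ge d$
for all pairs $u, v$ of codewords. Hence, $S \ge \binom{r}{2} d = \frac{r(r-1)}{2} d$. Let $t_0^{(i)}$
and $t_1^{(i)}$ be the number of times that 0 and 1 appear in the $i$th column of
$A$, respectively. Then $t_0^{(i)} + t_1^{(i)} = r$ for all $i$. Also,

$S = \sum_{Q} \sum_{j} |a_{ij} - a_{kj}| = \sum_{j} \sum_{Q} |a_{ij} - a_{kj}|,$

where $Q$ is the set of all distinct pairs of rows (row $i$ and row $k$) of $A$. For each $j$,
$\sum_{Q} |a_{ij} - a_{kj}|$ is equal to the number of times that any two rows of $A$
contain differing entries in the $j$th position. This number is $t_0^{(j)} t_1^{(j)}$, so
$S = \sum_{j} t_0^{(j)} (r - t_0^{(j)})$. To find an upper bound on $S$, we consider the
function $f(x) = x(r - x)$ for $0 \le x \le r$. Note that $f(x)$ is maximized at
the point $(x, f(x)) = \left(\frac{r}{2}, \frac{r^2}{4}\right)$. Hence, $t_0^{(j)} (r - t_0^{(j)}) \le \frac{r^2}{4}$, and $S \le \frac{n r^2}{4}$. Thus,
$\frac{r(r-1)}{2} d \le \frac{n r^2}{4}$, and $r\left(d - \frac{n}{2}\right) \le d$. Therefore, since $d > \frac{n}{2}$, $r \le \frac{2d}{2d - n}$. ∎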


Corollary 3.4 Let r be the number of codewords in a code with parameters
$(4m - 1, 2m)$ for some m. Then $r \le 4m$.


Proof. Exercise. ∎


3.3 Reed-Muller Codes


In Section 3.2, we showed that normalized Hadamard matrices can be used 
to construct error-correcting codes we called Hadamard codes. We also 
showed that the number of codewords in a Hadamard code is maximal in 
the sense that no vectors can be included in the code without decreasing 
the minimum distance of the code. As a consequence of the following 
theorem, by increasing the length of the codewords in a Hadamard code by 
one position, we can double the number of codewords in the code without 
decreasing the minimum distance of the code. 


Theorem 3.5 Suppose A is the incidence matrix constructed from a
normalized Hadamard matrix of order 4m, and let B be the matrix that results
from interchanging all zeros and ones in A. Let $\mathcal{A}$ be the matrix obtained
by placing a one in front of all of the rows of A, and let $\mathcal{B}$ be the matrix
obtained by placing a zero in front of all of the rows of B. Then the rows
of $\mathcal{A}$ and $\mathcal{B}$ taken together form a (4m, 2m) code with $8m - 2$ codewords.


Proof. Exercise. ∎


Each of the rows in the matrices $\mathcal{A}$ and $\mathcal{B}$ in Theorem 3.5 will contain
2m zeros and 2m ones. Hence, there will be 2m positions in which both of
the vectors $(0\ 0\ \cdots\ 0)$ and $(1\ 1\ \cdots\ 1)$ of length 4m positions differ from
each of the rows of $\mathcal{A}$ and $\mathcal{B}$. Thus, by including the vectors $(0\ 0\ \cdots\ 0)$
and $(1\ 1\ \cdots\ 1)$ of length 4m positions with the rows of $\mathcal{A}$ and $\mathcal{B}$, we obtain
a (4m, 2m) code with 8m codewords. And no more vectors can be included
in this code without decreasing the minimum distance of the code. These
(4m, 2m) codes with 8m codewords are called Reed-Muller codes.
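

As a rough check of this construction, the Python sketch below (our own illustration, not the book's Maple code) builds the (16,8) Reed-Muller code from a Hadamard matrix of order 16 obtained by doubling. The helper names `sylvester` and `reed_muller_code` are hypothetical, and the order of the codewords (rows with a leading one, then rows with a leading zero, then the two constant vectors) is simply a choice made here.

```python
from itertools import combinations

def sylvester(order):
    """Normalized Hadamard matrix of power-of-two order via repeated doubling."""
    H = [[1]]
    while len(H) < order:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def reed_muller_code(order):
    """All codewords of the (4m, 2m) code built from a Hadamard matrix of order 4m."""
    H = sylvester(order)
    A = [[1 if x == 1 else 0 for x in row[1:]] for row in H[1:]]  # incidence matrix
    B = [[1 - x for x in row] for row in A]                       # zeros and ones swapped
    script_a = [tuple([1] + row) for row in A]                    # a one in front of rows of A
    script_b = [tuple([0] + row) for row in B]                    # a zero in front of rows of B
    return script_a + script_b + [(0,) * order, (1,) * order]

cw = reed_muller_code(16)
d = min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(cw, 2))
print(len(cw), len(set(cw)), d)  # 32 32 8
```

The 32 distinct codewords and minimum distance 8 agree with the parameters (4m, 2m) and 8m codewords for m = 4.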


A Reed-Muller code was used in the Mariner 9 space probe when it 
returned photographs of Mars to Earth in 1972. The specific code used 
in the space probe was the (32,16) Reed-Muller code with 64 codewords 
constructed using the normalized Hadamard matrix H32 of order 32 (see 
Maple Exercise 1). Before being transmitted, each photograph was broken 
down into a collection of very small dots. Each dot was then assigned one 
of 64 levels of grayness and encoded into one of the 64 codewords. 


3.4 Reed-Muller Codes with Maple 


In this section, we show how Maple can be used to construct and correct
errors in the (16,8) Reed-Muller code.


We begin by generating the normalized Hadamard matrix $H_{16}$ of order
16 used to construct the code.¹


> with(linalg):
> H1 := matrix(1, 1, [1]):
> H2 := blockmatrix(2, 2, [H1, H1, H1, -H1]):
> H4 := blockmatrix(2, 2, [H2, H2, H2, -H2]):
> H8 := blockmatrix(2, 2, [H4, H4, H4, -H4]):
> H16 := blockmatrix(2, 2, [H8, H8, H8, -H8]):


We can then construct the incidence matrix A that results from $H_{16}$ as
follows.


> A := delrows(H16, 1..1):
> A := delcols(A, 1..1):
> f := x -> if x = -1 then 0 else 1 fi:
> A := map(f, A);

     [0 1 0 1 0 1 0 1 0 1 0 1 0 1 0]
     [1 0 0 1 1 0 0 1 1 0 0 1 1 0 0]
     [0 0 1 1 0 0 1 1 0 0 1 1 0 0 1]
     [1 1 1 0 0 0 0 1 1 1 1 0 0 0 0]
     [0 1 0 0 1 0 1 1 0 1 0 0 1 0 1]
     [1 0 0 0 0 1 1 1 1 0 0 0 0 1 1]
     [0 0 1 0 1 1 0 1 0 0 1 0 1 1 0]
A := [1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]
     [0 1 0 1 0 1 0 0 1 0 1 0 1 0 1]
     [1 0 0 1 1 0 0 0 0 1 1 0 0 1 1]
     [0 0 1 1 0 0 1 0 1 1 0 0 1 1 0]
     [1 1 1 0 0 0 0 0 0 0 0 1 1 1 1]
     [0 1 0 0 1 0 1 0 1 0 1 1 0 1 0]
     [1 0 0 0 0 1 1 0 0 1 1 1 1 0 0]
     [0 0 1 0 1 1 0 0 1 1 0 1 0 0 1]


Next, we construct the matrix B from Theorem 3.5 by interchanging all 
zeros and ones in the matrix A. To do this, we define and apply the 
following function g to the entries in A. 


> g := x -> if x = 0 then 1 else 0 fi:
> B := map(g, A);

     [1 0 1 0 1 0 1 0 1 0 1 0 1 0 1]
     [0 1 1 0 0 1 1 0 0 1 1 0 0 1 1]
     [1 1 0 0 1 1 0 0 1 1 0 0 1 1 0]
     [0 0 0 1 1 1 1 0 0 0 0 1 1 1 1]
     [1 0 1 1 0 1 0 0 1 0 1 1 0 1 0]
     [0 1 1 1 1 0 0 0 0 1 1 1 1 0 0]
     [1 1 0 1 0 0 1 0 1 1 0 1 0 0 1]
B := [0 0 0 0 0 0 0 1 1 1 1 1 1 1 1]
     [1 0 1 0 1 0 1 1 0 1 0 1 0 1 0]
     [0 1 1 0 0 1 1 1 1 0 0 1 1 0 0]
     [1 1 0 0 1 1 0 1 0 0 1 1 0 0 1]
     [0 0 0 1 1 1 1 1 1 1 1 0 0 0 0]
     [1 0 1 1 0 1 0 1 0 1 0 0 1 0 1]
     [0 1 1 1 1 0 0 1 1 0 0 0 0 1 1]
     [1 1 0 1 0 0 1 1 0 0 1 0 1 1 0]


¹See footnote p. 33 regarding the Maple blockmatrix command.


We now construct the matrices $\mathcal{A}$ and $\mathcal{B}$ from Theorem 3.5. Recall
that to construct $\mathcal{A}$ and $\mathcal{B}$, we must place a one in front of all of the rows
of A and a zero in front of all of the rows of B. To do this, we first define
the following vectors colA and colB.

> colA := vector(rowdim(A), 1);

colA := [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

> colB := vector(rowdim(B), 0);

colB := [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


By construction, the preceding vectors both have the same number of
positions as the number of rows in A or B. Hence, by placing the vectors colA
and colB as columns in front of the matrices A and B respectively, we will
obtain the matrices $\mathcal{A}$ and $\mathcal{B}$. We can do this using the Maple augment
command as follows.

> scriptA := augment(colA, A):
> scriptB := augment(colB, B):


The rows of the matrices scriptA and scriptB taken together form all but
two of the codewords in the (16,8) Reed-Muller code. The two codewords
not included in the rows of these matrices are the vectors $(0\ 0\ \cdots\ 0)$ and
$(1\ 1\ \cdots\ 1)$ of length 16 positions. We create these vectors next.

> v_zero := vector(coldim(scriptB), 0): 


> v_one := vector(coldim(scriptB), 1): 


We can then view the codewords in the (16,8) Reed-Muller code by using
the Maple stackmatrix command² as follows to stack the matrices scriptA
and scriptB and the vectors v_zero and v_one.


> cw := stackmatrix(scriptA, scriptB, v_zero, v_one);


CW := 


> 


1 O 1 0O 10 1 0 10 10 10 


1 0 


a a a O E S 
TG eOe a a YH 
mO Cr O mN Oe O eO 
saa ia SP | 
D o o eD OO e eA a r 
(ans DE a NE a StS AE e GE ee A e srt aS 
mononoontonor 
Sn oe oe ee a 
On Onn OoOTWooconTor 
ooocOonwnwn eT OOO co 
mHoononTW on oOo +4 
eo) Oo OOO aS eat St oO. © 
OonnwroonT oon oO 
Coon mas oon TOON dT 
wmAonWron COT OTF OT o 
Se ae en oe ee ee ee | 


0 
1 
1 
1 


S TOn 
Honno 
mE D 
Er e 
ae Or 
on oO 
ooo © 
So A o AAL a PES a 
Haon 
[a E a E oO 
ooo O 
Onn a 
ooocrn 
Honno 
HH OO 


1 0 0 1 


1 1 


1 1 1 0 0 0 O 1 1 


0 0 0 0 1 


000 0 1 


1 


1 O 1 0 0O 1 0 1 


1 
000000 0 0 1 


0 


0 1 0 10 1 10101010 


1 


0 


1 0 0 0 0 
1 1 
1 


1 00 0 0 1 


1 1 
1 0 0 
1 0 0 1 


1 
0 


1 
000000 00 0 00 0 0 0 0 0 


000 0 1 
0 1 
1 

1 


0 
0 
0 


1 0 


1 0010 1 


1 01 0 0 1 


1 


²Maple V Release 5 is the first release of Maple that uses stackmatrix to stack
matrices and vectors vertically. Earlier releases of Maple use the stack command to
accomplish this. For example, with an earlier release of Maple, we would construct the
matrix cw by entering the following command.

> cw := stack(scriptA, scriptB, v_zero, v_one);


Recall that the (4m, 2m) Reed-Muller code contains 8m codewords. Hence,
there should be 32 rows in the preceding matrix. We can verify this as
follows.


> rowdim(cw) ; 


32 


We now show how Maple can be used to correct a vector in $Z_2^{16}$ in
the (16,8) Reed-Muller code. Note first that since in general the (4m, 2m)
Reed-Muller code is $(m - 1)$-error correcting, the (16,8) Reed-Muller code
will be 3-error correcting. Suppose a codeword in the (16,8) Reed-Muller
code is transmitted and we receive the following vector.


> r := vector([1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0]):


We can then use the following commands to determine if this received 
vector contains a correctable error. We first define the general Reed-Muller 
parameter m = 4, and then define two additional parameters fc and rn 
we will use in the subsequent while loop. The loop compares each of the 
rows in the matrix cw (i.e., each of the codewords) with the received vector 
r. The norm command that appears in the loop counts the number of 
positions in which each row of cw differs from r. If a row is found that 
differs from r in fewer than m positions, the variable fc is assigned the 
value 1. This terminates the loop, leaving m as the number of errors in r, 
and rn as the number of the row in cw that differs from r in fewer than 
m positions. If no codeword is found that differs from r in fewer than m 
positions, the loop ends when the rows of cw are exhausted, leaving m with 
its initial value of 4. 


> m := 4:
> fc := 0:
> rn := 0:
> while (fc <> 1) and (rn < rowdim(cw)) do
>    rn := rn + 1;
>    if norm(row(cw, rn) - r, 1) < m then
>       m := norm(row(cw, rn) - r, 1):
>       fc := 1
>    fi:
> od:


After we execute the preceding commands we can enter the following com- 
mand to see if r contains a correctable error. 


> m;


3


This value for m indicates that r contains three errors, and hence is
correctable. The following command shows that the codeword that differs
from r in three positions is the 22nd row of cw.


> rn; 


22 


We can view this codeword by entering the following command. 


> evalm(row(cw, rn)); 
[0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1]


And we can see the positions in r that contain errors by entering the fol- 
lowing command. 


> map(x -> x mod 2, evalm(row(cw, rn) - r)); 


[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] 
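

For readers without Maple, the search loop of this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's code: the helper names `build_cw` and `correct` are our own, and the matrix cw is rebuilt with a shortcut, namely that entry (i, j) of the doubling-type Hadamard matrix (rows and columns counted from 0) is +1 exactly when the bitwise AND of i and j has an even number of one bits.

```python
def build_cw(order=16):
    """Codewords in the same order as cw: scriptA rows, scriptB rows, zeros, ones."""
    # Shortcut (an assumption of this sketch): bit-parity formula for the Hadamard entries.
    A = [[1 - bin(i & j).count("1") % 2 for j in range(1, order)]
         for i in range(1, order)]
    script_a = [[1] + row for row in A]
    script_b = [[0] + [1 - x for x in row] for row in A]
    return script_a + script_b + [[0] * order, [1] * order]

def correct(received, codewords, m):
    """Return (row number, number of errors) for the first codeword that
    differs from `received` in fewer than m positions, or None if there is none."""
    for rn, cw_row in enumerate(codewords, start=1):
        errors = sum(a != b for a, b in zip(cw_row, received))
        if errors < m:
            return rn, errors
    return None

cw = build_cw()
r = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
rn, errors = correct(r, cw, m=4)
print(rn, errors)   # 22 3
print(cw[rn - 1])   # [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
```

As in the Maple session, the search stops at row 22 with three differing positions.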


3.5 Linear Codes 


As shown, Hadamard and Reed-Muller codes are easy to construct and can
have significant error-correction capabilities. However, because Hadamard
and Reed-Muller codes do not generally form vector spaces, they are not
ideal for situations in which a very large number of codewords is needed.
In particular, error correction in these codes must consist of comparing
received vectors with each of the codewords one by one. This is the scheme
we used to correct the received vector in Section 3.4. While this error
correction scheme poses no real problems in small codes like the one
constructed in Section 3.4, it would not be an efficient way to correct errors
in a code with a very large number of codewords. In this section we discuss
a method for constructing codes that form vector spaces. We then discuss
some error correction schemes for these codes.


Recall that a code that forms a vector space is called a linear code.
We will describe linear codes by the parameters [n, k] if the codewords
in the code are of length n positions and the code forms a vector
space of dimension k. In this section we discuss linear codes constructed
using generator matrices. Specifically, let $W = Z_2^k$ and $V = Z_2^n$ with
$k < n$, and let G be a $k \times n$ matrix over $Z_2$ of full row rank. Then
$C = \{ v \in V \mid v = wG \text{ for some } w \in W \}$ is a subspace of V of dimension
k. Hence, the vectors in C form the codewords in an [n, k] linear code in V
with $2^k$ codewords. The matrix G is called a generator matrix for C.




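Example 3.4 Let $W = Z_2^2 = \{(00), (10), (01), (11)\}$, and choose the
following generator matrix G.

$$G = \begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}$$

Then C = {(00000000000), (11110000111), (00001111111), (11111111000)}
is the resulting [11,2] linear code. ∎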
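

The encoding $w \mapsto wG$ in Example 3.4 can be sketched in Python as follows. This is only an illustration: the function name `encode` is our own, and the rows of G are taken to be the codewords listed in the example for the messages (10) and (01).

```python
from itertools import combinations, product

def encode(w, G):
    """Return the codeword wG over Z_2 for a message w and generator matrix G."""
    return tuple(sum(wi * gij for wi, gij in zip(w, col)) % 2 for col in zip(*G))

G = [[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
     [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]]

# One codeword for each of the 2^k messages in W.
C = [encode(w, G) for w in product([0, 1], repeat=len(G))]
d = min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(C, 2))

for c in C:
    print("".join(map(str, c)))
print("minimum distance:", d)  # minimum distance: 7
```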


Note that the code C in Example 3.4 has minimum distance d = 7.
Hence, C is 3-error correcting, whereas errors cannot be corrected in
$W = Z_2^2$. Of course, the vectors in C are longer than the vectors in W.
Consequently, it would take more “effort” to transmit the vectors in C. 
However, the ability to correct up to 3 transmission errors in C should 
be much more valuable than the extra effort required to transmit the vec- 
tors. Furthermore, W can still be used for encoding and decoding actual 
messages or information. The general idea we can take is that messages 
or information can be encoded in W and then converted to C before being 
transmitted. Received vectors can then be corrected in C (if necessary) and 
converted back to W to be decoded. Note that in order for this process 
to be valid, we must be able to convert between W and C uniquely. But 
this is precisely why we required G to have full row rank. Since G has full
row rank, G has a right inverse, say B. Therefore, $w \in W$ can be retrieved
uniquely from $wG \in C$ by $w = wGB$.


We now consider the problem of detecting errors in received vectors 
that occur from codewords in linear codes constructed using generator ma- 
trices. Because linear codes are vector spaces, there are techniques for 
identifying received vectors as codewords in linear codes that are generally 
much more efficient than comparing the received vectors with each of the 
codewords one by one. For a linear code C constructed from W using
generator matrix G of size $k \times n$, consider an $(n - k) \times n$ matrix H of full
row rank over $Z_2$ with $HG^t = 0$. Since $HG^t = 0$, then $HG^t w^t = 0$ for all
$w \in W$. Hence, $H(wG)^t = 0$ for all $w \in W$, or, equivalently, $Hc^t = 0$ for
all $c \in C$. And since H has full row rank, it can be shown that $Hc^t = 0$ if
and only if $c \in C$. Thus, H can be used to identify codewords in C. The
matrix H is called a parity check matrix for C.


To determine a parity check matrix H from a generator matrix G, note
that $HG^t = 0$ implies $GH^t = 0$, so the columns of $H^t$, that is, the rows of
H, are in the null space of G. Thus, to determine H from G, we must only
find a basis for the null space of G and place these basis vectors as rows
in H. In practice, when constructing a linear code, it is often convenient
to begin with a parity check matrix rather than a generator matrix. But
since $HG^t = 0$, then G can be determined from H in the same way H can
be determined from G. That is, G can be determined from H by finding a
basis for the null space of H and placing these basis vectors as rows in G.


Example 3.5 Consider a linear code C with the following parity check
matrix H.

$$H = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$

To construct a generator matrix G for C, we find a basis for the null space
of H by considering H as the coefficient matrix for the following system of
3 homogeneous equations in 7 unknowns.

$$\begin{aligned} x_1 + x_3 + x_5 + x_7 &= 0 \\ x_2 + x_3 + x_6 + x_7 &= 0 \\ x_4 + x_5 + x_6 + x_7 &= 0 \end{aligned}$$

By solving these equations for $x_1$, $x_2$, and $x_4$ in terms of the others, we can
find a basis for the null space of H by setting each of $x_3$, $x_5$, $x_6$, and $x_7$
equal to one while setting the others equal to zero. For example, setting
$x_5 = 1$ and $x_3 = x_6 = x_7 = 0$ gives $x_1 = x_4 = 1$ and $x_2 = 0$. This yields
the basis vector (1001100). This vector and the other three basis vectors
constructed similarly form the rows in the following generator matrix G.

$$G = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}$$

To construct the codewords in C, we would take $W = Z_2^4$ and form wG
for all $w \in W$. The resulting code is a [7,4] linear code with 16 codewords
called a Hamming code. ∎
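

The hand computation in Example 3.5 can be automated. The Python sketch below is a generic row reduction over $Z_2$, written for illustration; the function name `null_space_basis_gf2` is our own. It reduces H, then sets each free variable equal to one in turn while the other free variables are zero.

```python
def null_space_basis_gf2(H):
    """Basis of {x : H x^t = 0 over Z_2}, one basis vector per free column."""
    rows = [row[:] for row in H]
    n = len(rows[0])
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue  # no pivot here, so this column is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [0] * n
        x[free] = 1
        for i, p in enumerate(pivots):
            x[p] = rows[i][free]  # over Z_2, -a equals a
        basis.append(x)
    return basis

H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
for g in null_space_basis_gf2(H):
    print("".join(map(str, g)))
```

For this H the pivot columns are 1, 2, and 4, so the free variables are $x_3$, $x_5$, $x_6$, and $x_7$, and the four vectors printed are the rows of G found in the example.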


The code in Example 3.5 is called a Hamming code because of the
form of the parity check matrix H. Note that the columns of the matrix
H in Example 3.5 are the numbers 1, 2, ..., 7 in order expressed in binary.
For example, the 6th column of H is $[1, 1, 0]^t$, whose entries are the
coefficients in the expression $6 = 1 \cdot 2^2 + 1 \cdot 2^1 + 0 \cdot 2^0$. In general, to
construct a Hamming code, we place the binary expressions of the numbers
$1, 2, \ldots, 2^m - 1$ for some integer $m > 1$ in order as columns in a parity
check matrix H of size $m \times (2^m - 1)$. The reason we stop at a number of
the form $2^m - 1$ is so that the columns of H will form all nonzero vectors
of length m over $Z_2$. The importance of this is for error correction and will
be addressed later. From H, we determine a generator matrix G of size
$(2^m - 1 - m) \times (2^m - 1)$ over $Z_2$ by finding a basis for the null space of H
over $Z_2$. We then construct the codewords in the code by forming wG for
all vectors w of length $2^m - 1 - m$ over $Z_2$.


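Example 3.6 The following is the parity check matrix H for the [15,11]
Hamming code.

$$H = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$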
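

The rule "column j is the number j written in binary" translates directly into code. The following Python sketch (the function name `hamming_parity_check` is our own) builds the parity check matrix for any m, with the most significant binary digit in the top row as in Examples 3.5 and 3.6.

```python
def hamming_parity_check(m):
    """m x (2^m - 1) matrix over Z_2 whose j-th column is j in binary,
    most significant digit in the top row."""
    n = 2 ** m - 1
    return [[(j >> (m - 1 - row)) & 1 for j in range(1, n + 1)]
            for row in range(m)]

for row in hamming_parity_check(4):
    print(" ".join(map(str, row)))
```

With m = 3 the same function returns the 3 x 7 matrix H of Example 3.5.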


All Hamming codes are one-error correcting (see Corollary 3.8) and
perfect (see Written Exercise 11). Recall that a code C in $Z_2^n$ is said
to be perfect if every vector in $Z_2^n$ is correctable in C. The fact that
Hamming codes are perfect is a consequence of the discussion immediately
preceding Theorem 3.1 regarding the number of correctable vectors in
a t-error correcting code in $Z_2^n$. For example, because the [7,4] Hamming
code C is one-error correcting, the number of vectors in $Z_2^7$ that are
correctable in C is

$$|C| \cdot \left( \binom{7}{0} + \binom{7}{1} \right) = 16(1 + 7) = 128.$$

But there are only $2^7 = 128$ vectors in $Z_2^7$. Thus, every vector in $Z_2^7$ is
correctable in the [7,4] Hamming code. Hence, the [7,4] Hamming code is
perfect. The general result is Written Exercise 11.
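

The same count can be repeated numerically for other Hamming codes. The short Python sketch below is a sanity check only, not a proof of the general result; the function name `correctable_count` is our own, and it assumes that the sets of vectors within distance t of distinct codewords do not overlap, which holds for a t-error correcting code.

```python
from math import comb

def correctable_count(n, k, t):
    """Number of vectors within distance t of some codeword in a t-error
    correcting code of length n with 2^k codewords."""
    return 2 ** k * sum(comb(n, i) for i in range(t + 1))

for m in range(2, 7):
    n, k = 2 ** m - 1, 2 ** m - 1 - m
    print(m, correctable_count(n, k, 1), 2 ** n)
```

For each m in this range the two printed counts agree, as they do for the [7,4] code above.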


We have now seen an effective method for detecting errors in received
vectors that occur from codewords in linear codes constructed using
generator matrices. Specifically, for a linear code C with parity check matrix H,
$Hc^t = 0$ if and only if $c \in C$. We now consider the problem of correcting
errors in received vectors that occur from codewords in these codes. Let
C be a linear code in $Z_2^n$ with parity check matrix H. Suppose $c \in C$ is
transmitted and we receive the vector $r \in Z_2^n$. Then $r = c + e$ for some error
vector $e \in Z_2^n$ that contains ones in the positions where r and c differ and
zeros elsewhere. Note that $Hr^t = Hc^t + He^t = He^t$, so we can determine
$He^t$ by computing $Hr^t$. If we can then find e from $He^t$, we can form the
corrected codeword as $c = r + e$.


Consider again the Hamming codes. Because they are one-error
correcting and perfect, the only error vectors we must consider with Hamming
codes are the vectors $e_i$ that contain all zeros except a single one in the ith
position. Suppose a codeword in the $[2^m - 1, 2^m - 1 - m]$ Hamming code
C is transmitted and we receive the vector $r \in Z_2^{2^m - 1}$. If $r \notin C$, then since
the columns in the parity check matrix H for C form all nonzero vectors of
length m over $Z_2$, $Hr^t$ will be one of the columns of H. Suppose $Hr^t$ is the
jth column of H. Since the jth column of H is also $He_j^t$, then $Hr^t = He_j^t$.
Thus, the error in r is $e_j$. Note also that the jth column of H is the binary
expression of the number j. Hence, if $Hr^t$ is the binary expression of the
number j, then the error in r is $e_j$.


Example 3.7 Suppose a codeword in the [7,4] Hamming code C is
transmitted and we receive the vector r = (1011001). Then with the parity
check matrix H for C in Example 3.5, $Hr^t = (001)^t$ is the first column of
H. Thus, the error in r is $e_1 = (1000000)$, and we correct r to the codeword
$c = r + e_1 = (0011001) \in C$. ∎


Example 3.8 Suppose a codeword in the [15,11] Hamming code C is
transmitted and we receive the vector r = (101011100111000). Then with
the parity check matrix H for C in Example 3.6, $Hr^t = (1011)^t$ is the 11th
column of H. Thus, the error in r is $e_{11} = (000000000010000)$, and we
correct r to the codeword $c = r + e_{11} = (101011100101000) \in C$. ∎
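

Because the jth column of H is the binary expression of j, the syndrome of a received vector can be accumulated as an integer, and that integer is the position of the error. The Python sketch below illustrates this on Examples 3.7 and 3.8; the function names `hamming_correct` and `bits` are our own.

```python
def hamming_correct(r):
    """Correct at most one error in a received vector r of length 2^m - 1.
    Returns (j, c), where j is the syndrome read as an integer (0 if r is a
    codeword) and c is the corrected vector."""
    syndrome = 0
    for position, bit in enumerate(r, start=1):
        if bit:
            syndrome ^= position  # adds column `position` of H over Z_2
    c = list(r)
    if syndrome:
        c[syndrome - 1] ^= 1      # the error is e_j with j = syndrome
    return syndrome, c

def bits(s):
    """Convert a string such as '1011001' into a list of 0/1 integers."""
    return [int(ch) for ch in s]

print(hamming_correct(bits("1011001")))
print(hamming_correct(bits("101011100111000")))
```

The first call reports an error in position 1 and the second an error in position 11, matching the two examples.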


As mentioned, Hamming codes are one-error correcting. We now con- 
sider the problem of determining the number of errors that are guaranteed 
to be correctable in more general linear codes constructed using generator 
matrices. We discussed in Section 3.1 that we can determine the number of 
errors that are guaranteed to be correctable in a code by finding the mini- 
mum distance of the code. Specifically, in a code with minimum distance d, 
we are guaranteed to be able to uniquely correct t errors for any $t < d/2$. In
a code with a very large number of codewords, it would not be efficient to 
find the minimum distance of the code by actually computing the Hamming 
distance between each pair of codewords. However, because linear codes 
are vector spaces, there are techniques for determining the minimum dis- 
tance that are generally much more efficient than computing the Hamming 
distance between each pair of codewords. The following Theorems 3.6 and 
3.7 provide such techniques. 


For a codeword x in a linear code constructed using a generator matrix,
we define the Hamming weight w(x) to be the number of ones in x. That
is, w(x) = d(x, 0), the Hamming distance between x and the zero vector.


Theorem 3.6 Let C be a linear code constructed using a generator matrix,
and suppose $w = \min\{ w(x) \mid x \in C,\ x \ne 0 \}$. Then w = d(C).


Proof. Since w = w(c) = d(c, 0) for some $c \in C$, it must be the case that
$d(C) \le w$. But $d(C) = d(x, y) = w(x - y)$ for some $x, y \in C$. Since C is a
vector space, then $x - y \in C$. Hence, $w \le d(C)$. ∎
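

Theorem 3.6 can be checked on a small code by brute force. The Python sketch below (for illustration; the names are our own) uses the generator matrix of the [7,4] Hamming code from Example 3.5 and compares the minimum nonzero weight, which needs one pass over the codewords, with the minimum distance over all pairs.

```python
from itertools import combinations, product

# Generator matrix of the [7,4] Hamming code from Example 3.5.
G = [[1, 1, 1, 0, 0, 0, 0],
     [1, 0, 0, 1, 1, 0, 0],
     [0, 1, 0, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def codewords(G):
    """All vectors wG over Z_2, one for each message w."""
    cols = list(zip(*G))
    return [tuple(sum(wi * gi for wi, gi in zip(w, col)) % 2 for col in cols)
            for w in product([0, 1], repeat=len(G))]

C = codewords(G)
min_weight = min(sum(c) for c in C if any(c))  # 15 weights
min_dist = min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(C, 2))  # 120 pairs
print(min_weight, min_dist)  # 3 3
```

Even for this small code the weight computation examines 15 codewords rather than 120 pairs.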


Theorem 3.7 Let C be a linear code with parity check matrix H, and
let s be the minimum number of linearly dependent columns in H. Then
s = d(C).


Proof. Let $w = \min\{ w(x) \mid x \in C,\ x \ne 0 \}$, and suppose $C_{i_1}, \ldots, C_{i_s}$
are linearly dependent columns in H. Then

$$a_1 C_{i_1} + \cdots + a_s C_{i_s} = 0$$

for some nonzero $a_1, \ldots, a_s$. Let x be a vector of appropriate length (the
number of columns in H) with $a_j$ in position $i_j$ for $j = 1, \ldots, s$ and zeros
elsewhere. Then $Hx^t = 0$. Hence, $x \in C$. Thus, $s \ge w = d(C)$. Conversely,
let $y \in C$ with w(y) = d(C), and let $i_1, \ldots, i_d$ be the positions in y that
are nonzero. Then $0 = Hy^t = C_{i_1} + \cdots + C_{i_d}$, so columns $C_{i_1}, \ldots, C_{i_d}$ are
linearly dependent. Thus, $s \le d(C)$. ∎


The fact that Hamming codes are one-error correcting can be shown 
as a corollary to Theorem 3.7. We show this next. 


Corollary 3.8 Let C be a Hamming code. Then C is one-error correcting. 


Proof. Note that the first three columns in the parity check matrix H
for C are linearly dependent. Also, no two columns in H are linearly
dependent since either they would be equal or one would be the zero vector.
By Theorem 3.7, d(C) = 3. Thus, C is one-error correcting. ∎
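

Theorem 3.7 suggests a way to find d(C) from H alone. Over $Z_2$ every nonzero coefficient equals 1, so a smallest set of linearly dependent columns is a smallest nonempty set of columns whose sum is the zero vector. The Python sketch below searches for such a set by brute force, which is adequate only for small matrices; the function name `min_dependent_columns` is our own.

```python
from itertools import combinations

def min_dependent_columns(H):
    """Smallest s such that some s columns of H sum to zero over Z_2."""
    cols = list(zip(*H))
    for s in range(1, len(cols) + 1):
        for subset in combinations(cols, s):
            if all(sum(entries) % 2 == 0 for entries in zip(*subset)):
                return s
    return None  # the columns are linearly independent

# Parity check matrix of the [7,4] Hamming code from Example 3.5.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
print(min_dependent_columns(H))  # 3
```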


We have already shown how errors can be corrected in Hamming codes. 
We now consider error correction in more general linear codes constructed 
using generator matrices. 


Let C be a t-error correcting linear code in $Z_2^n$. A subset S of $Z_2^n$ is
called a coset of C if any two vectors in S differ by an element in C. Suppose
$c \in C$ is transmitted and we receive the vector $r \in Z_2^n$ with $r = c + e$ for
some nonzero error vector e. Since r and e differ by an element in C, then
r and e are in the same coset of C. Hence, if r contains t or fewer errors,
we can find the error vector e that corresponds to r by finding the unique
vector with the fewest ones in the coset that contains r. In a code with a
very large number of codewords, it would not be practical to construct all
of the elements in the cosets. The following theorem yields an equivalence
on vectors in the same coset for such codes.


Theorem 3.9 Let C be a linear code with parity check matrix H. Then
u, v are in the same coset of C if and only if $Hu^t = Hv^t$.


Proof. Exercise. ∎


Theorem 3.9 states that each coset S of a linear code with parity check
matrix H can be uniquely identified by $Hu^t$ for any $u \in S$. We will call
$Hu^t$ the syndrome of u. Suppose a codeword c in a t-error correcting linear
code C in $Z_2^n$ is transmitted and we receive the vector $r \in Z_2^n$ with $r = c + e$
for some nonzero error vector e. If r contains t or fewer errors, then we
can find e by finding the unique vector with the same syndrome as r that
contains t or fewer ones. And if r contains more than t errors, then the
syndrome of r will not match the syndromes of any of the vectors in $Z_2^n$
that contain t or fewer ones. When a coset contains a unique vector with
the fewest number of ones, we will call this vector the coset leader. Hence,
for a t-error correcting linear code, each vector that contains t or fewer ones
must be a coset leader.


Example 3.9 Let $W = Z_2^2 = \{(00), (10), (01), (11)\}$, and choose the
following generator matrix G.

$$G = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}$$

Then C = {(00000), (11100), (00111), (11011)} is the resulting [5,2] linear
code. It can easily be verified that the following matrix H is a parity check
matrix for C.

$$H = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$

It can also easily be verified that C is one-error correcting. Therefore, the
only coset leaders in $Z_2^5$ for C will be the zero vector and the five vectors in
$Z_2^5$ that contain a single one. The following table shows these coset leaders
and their syndromes.


Coset Leader    Syndrome
(00000)         $(000)^t$
(10000)         $(111)^t$
(01000)         $(100)^t$
(00100)         $(011)^t$
(00010)         $(010)^t$
(00001)         $(001)^t$


Suppose a codeword $c \in C$ is transmitted and we receive the vector
$r_1 = (00011) \in Z_2^5$. To correct this vector, we compute $Hr_1^t = (011)^t$.
Since the coset leader (00100) also has this syndrome, then the error in $r_1$
is e = (00100). Hence, we would correct $r_1$ to $c = r_1 + e = (00111)$. Note
that because each coset for this code contains 4 vectors, only 24 of the 32
vectors in $Z_2^5$ are in cosets that have coset leaders. For example, suppose a
codeword in C is transmitted and we receive the vector $r_2 = (01001)$. To
correct this vector we compute $Hr_2^t = (101)^t$. But none of the coset leaders
for C has this syndrome, so $r_2$ is not in a coset with a coset leader.
Thus, $r_2$ cannot be corrected. ∎
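

The table lookup in Example 3.9 can be sketched in Python as follows. This is an illustration with names of our own choosing (`syndrome`, `leader_table`, `correct`): the table maps the syndrome of every vector with at most t ones to that vector, and a received vector whose syndrome is missing from the table is reported as uncorrectable.

```python
from itertools import combinations

H = [[1, 1, 0, 0, 0],
     [1, 0, 1, 1, 0],
     [1, 0, 1, 0, 1]]

def syndrome(H, v):
    """The syndrome H v^t over Z_2, returned as a tuple."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def leader_table(H, t):
    """Map syndrome -> coset leader for every vector with at most t ones."""
    n = len(H[0])
    table = {}
    for weight in range(t + 1):
        for positions in combinations(range(n), weight):
            e = tuple(1 if i in positions else 0 for i in range(n))
            table[syndrome(H, e)] = e
    return table

def correct(H, table, r):
    """Return the corrected codeword, or None if r is in no listed coset."""
    e = table.get(syndrome(H, r))
    if e is None:
        return None
    return tuple((a + b) % 2 for a, b in zip(r, e))

table = leader_table(H, t=1)
print(correct(H, table, (0, 0, 0, 1, 1)))  # (0, 0, 1, 1, 1)
print(correct(H, table, (0, 1, 0, 0, 1)))  # None
```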


3.6 Hamming Codes with Maple 


In this section we show how Maple can be used to construct codewords and 
correct errors in the [15,11] Hamming code. 


We begin by constructing the parity check matrix H for the code. We 
first enter the length m = 4 of the binary vectors that form the columns in 
the parity check matrix. 


> m := 4:


Recall that the columns of H are binary expressions of length m for
the numbers $1, 2, \ldots, 2^m - 1$. We can obtain the binary expression of
a number in Maple by using the convert command. For example, we can
obtain the binary expression of the number 4 by entering the following
command.


> cb := convert(4, base, 2);
cb := [0, 0, 1]


The entries in the preceding vector are the coefficients in the expression
$4 = 0 \cdot 2^0 + 0 \cdot 2^1 + 1 \cdot 2^2$. Note that this vector contains only three
positions, whereas for the columns in H we want binary vectors of length
m = 4 positions. That is, to be placed as the 4th column in H, we would
want the number 4 to be converted to the binary vector [0, 0, 1, 0].
Furthermore, note that the binary digits in this vector are the reverse of
how they should be expressed in the 4th column of H. To be directly placed
as the 4th column in H, the number 4 should be converted to the binary
vector [0, 1, 0, 0]. We can use the following commands to take care of
these problems. After first including the Maple linalg package, we define
the vector bv of length m = 4 containing all zeros. Then in the subsequent
for loop we place the binary digits from cb in appropriate order in the
vector bv.


> with(linalg):
> bv := vector(m, 0):
> for i from 1 to vectdim(cb) do
>    bv[m-i+1] := cb[i]:
> od:
> evalm(bv);


[0, 1, 0, 0]


We now construct the parity check matrix H for the [15,11] Hamming
code by placing properly ordered binary representations of length m = 4
for the numbers $1, 2, \ldots, 2^m - 1$ as columns in H. To do this, we first create
an empty list H. We then use a Maple for loop to build the parity check
matrix column by column in H. The op command that appears in the loop
allows new columns to be attached to H with the augment command.

> H := []:
> for j from 1 to 2^m-1 do
>    cb := convert(j, base, 2):
>    bv := vector(m, 0):
>    for i from 1 to vectdim(cb) do
>       bv[m-i+1] := cb[i]:
>    od:
>    H := augment(op(H), bv):
> od:
> evalm(H);

[0 0 0 0 0 0 0 1 1 1 1 1 1 1 1]
[0 0 0 1 1 1 1 0 0 0 0 1 1 1 1]
[0 1 1 0 0 1 1 0 0 1 1 0 0 1 1]
[1 0 1 0 1 0 1 0 1 0 1 0 1 0 1]


Next, we construct a generator matrix G for the [15,11] Hamming code
by finding a basis for the null space of H over $Z_2$ and placing these basis
vectors as rows in G. To do this, we first find a basis for the null space of
H using the Maple Nullspace command as follows.

> nH := Nullspace(H) mod 2;

nH := {[0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], ...}


The preceding output is a basis for the null space of H expressed as rows
in a set. Because Maple places a default ordering on the vectors in this set
(although not necessarily in the order in which they are displayed), each
basis vector can be retrieved by entering a command like the following.

> nH[2];

[0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0]

The following command returns the number of vectors in the set nH.

> nops(nH);

11

We now form a generator matrix G for the [15,11] Hamming code by placing
the vectors in the set nH as rows in G. In the following for loop we build
the generator matrix row by row in G. Note that we attach new rows to G
using the Maple stackmatrix command.³


> G := []:
> for i from 1 to nops(nH) do
>    G := stackmatrix(op(G), nH[i]):
> od:
> evalm(G);

[0 1 0 1 0 0 0 1 0 0 0 0 0 1 0]
[0 1 0 0 0 0 0 1 0 1 0 0 0 0 0]
[1 0 0 0 0 0 0 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0 0 0 0 0 0 0]
[1 0 0 1 1 0 0 0 0 0 0 0 0 0 0]
[1 1 0 1 0 0 1 0 0 0 0 0 0 0 0]
[1 1 0 0 0 0 0 1 0 0 1 0 0 0 0]
[1 0 0 1 0 0 0 1 0 0 0 0 1 0 0]
[1 1 0 1 0 0 0 1 0 0 0 0 0 0 1]
[0 0 0 1 0 0 0 1 0 0 0 1 0 0 0]
[0 1 0 1 0 1 0 0 0 0 0 0 0 0 0]


³See footnote p. 51 regarding the Maple stackmatrix command.


Recall that the codewords in a linear code constructed using a
generator matrix G are all vectors of the form wG where w is a vector over $Z_2$
of appropriate length. To see the length of the vectors w for the [15,11]
Hamming code, we can enter the following command, which returns the
number of rows in G.


> rowdim(G) ; 


11 


Hence, the vectors w for the [15,11] Hamming code should contain 11 po- 
sitions. For example, consider the following vector w. 
> w := vector([1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0]):


In the next command we form the codeword wG that results from w. Note 
that we use the map command to reduce the result over Z2. Note also that 
we use the Maple &* command for matrix multiplication. 


> c := map(x -> x mod 2, evalm(w &* G)); 
c := [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1] 
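

The encoding step above can be cross-checked outside Maple. In the Python sketch below, the rows of G are transcribed from the evalm(G) output shown earlier, H is rebuilt from the rule that column j is j written in binary, and the helper names are our own. The sketch confirms that every row of G lies in the null space of H and recomputes the codeword wG.

```python
# Rows of G as read from the Maple output, written as strings for brevity.
G_ROWS = ["010100010000010", "010000010100000", "100000011000000",
          "111000000000000", "100110000000000", "110100100000000",
          "110000010010000", "100100010000100", "110100010000001",
          "000100010001000", "010101000000000"]
G = [[int(ch) for ch in row] for row in G_ROWS]

m = 4
H = [[(j >> (m - 1 - i)) & 1 for j in range(1, 2 ** m)] for i in range(m)]

def mat_vec_mod2(M, v):
    """M v^t over Z_2."""
    return [sum(a * b for a, b in zip(row, v)) % 2 for row in M]

def vec_mat_mod2(w, M):
    """w M over Z_2."""
    return [sum(wi * row[j] for wi, row in zip(w, M)) % 2
            for j in range(len(M[0]))]

w = [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0]
c = vec_mat_mod2(w, G)
print(c)                   # [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
print(mat_vec_mod2(H, c))  # [0, 0, 0, 0]
```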


We now show how Maple can be used to correct errors in the [15,11] 
Hamming code. Suppose a codeword in the [15,11] Hamming code is trans- 
mitted and we receive the following vector r. 

> r := vector([0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0]): 


To determine if an error exists in this received vector, we compute the 
syndrome of r as follows. 
> syn := map(x -> x mod 2, evalm(H &* r)); 


syn := [1, 0, 0, 1]


Because this syndrome is nonzero, we know r is not a codeword in the 
[15,11] Hamming code. Recall that to correct the error in r we must only 
find the column in H that matches this syndrome. The number of this 
column in H will be the same as the number of the position in r that 
contains an error. We can use the following commands to find the column 
in H that matches this syndrome. We first assign the parameters fc and 
cn that we will use in the subsequent while loop. The loop compares each 
of the columns in H with the syndrome of r. When a match is found, the 
variable fc is assigned the value 1. This terminates the loop, leaving cn 
as the column number where the match occurred. The col command that 
appears in the loop allows us to access each of the columns of H. The Maple 
equal command is a logical statement that returns true if its parameters 
are equal, and false if not. 

> fc := 0:
> cn := 0:
> while (fc <> 1) and (cn < 2^m-1) do

> cn := cn + 1; 

> if equal(col(H, cn), syn) = true then 
> fc := 1; 

> fi: 

> od: 


After we execute the preceding commands, we can enter the following com- 
mand to see the position in r that contains an error. 


> cn; 
9 
This value for cn indicates that the error in r is in the 9th position. To
correct this error, we first define the following vector e of length $2^m - 1$
containing all zeros.


> e := vector(2^m-1, 0);
e := [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 


Next, we define the entry in position cn of e to be equal to 1. 


> e[cn] := 1: 


We can then see the error vector that corresponds to r as follows. 


> evalm(e); 
[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0] 


And we can see the corrected codeword as follows. 


> map(x -> x mod 2, evalm(r + e)); 


[0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]
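
The column-matching procedure above can also be sketched outside Maple. The following Python fragment is only an illustration and is not part of the Maple session. It assumes that column j of the parity check matrix H is the binary expansion of j with the most significant bit in the top row; this assumption is consistent with the syndrome [1, 0, 0, 1] and the error position 9 found above, but H itself is defined earlier in the text. The names `syndrome` and `correct_single_error` are hypothetical.

```python
# Sketch of single-error correction in the [15,11] Hamming code.
# Assumption: column j of H is the binary expansion of j (top row = most
# significant bit), which matches the syndrome and position found above.
M = 4
N = 2**M - 1  # code length 15

H = [[(j >> (M - 1 - row)) & 1 for j in range(1, N + 1)] for row in range(M)]

def syndrome(H, r):
    """Return H r^T reduced mod 2 as a list."""
    return [sum(h * x for h, x in zip(row, r)) % 2 for row in H]

def correct_single_error(H, r):
    """Return (corrected vector, 1-based error position or None)."""
    syn = syndrome(H, r)
    if not any(syn):
        return list(r), None          # r is already a codeword
    for cn in range(1, len(r) + 1):   # compare each column of H with syn
        if [row[cn - 1] for row in H] == syn:
            corrected = list(r)
            corrected[cn - 1] ^= 1    # flip the bit in position cn
            return corrected, cn
    raise ValueError("syndrome matches no column of H")

r = [0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0]
corrected, position = correct_single_error(H, r)
print(syndrome(H, r), position, corrected)
```

The loop plays the role of the Maple while loop: it stops at the first column equal to the syndrome and flips that position of r.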
Written Exercises


1. Using Theorem 3.1, find the maximum number of errors that are 
guaranteed to be correctable in a code of length 7 with 4 codewords. 
Then, using Theorem 3.3, show that it is not possible to construct a 
code of length 7 with 4 codewords that is guaranteed to correct this 
maximum number of errors. 


2. Construct a (15,8) code with 16 codewords. What is the maximum 
number of errors that are guaranteed to be correctable in this code? 


3. Construct an (8,4) code with 16 codewords. What is the maximum 
number of errors that are guaranteed to be correctable in this code? 


4. Is it possible to construct a [6,2] linear code that is 2-error correcting? 
State how you know. (Hint: See Theorem 3.1.) 


5. Find a generator matrix for a 2-error correcting linear code with 4 
codewords. Also, construct a parity check matrix for the code. 


6. Let C be the [7,4] Hamming code. 


(a) Construct the codewords in C. 

(b) Correct the following received vectors in C: $r_1 = (0011101)$,
$r_2 = (0100101)$.

(c) Make a list of the coset leaders for C and their syndromes. 


7. For the code in Example 3.9, which of the vectors $r_1 = (11101)$,
$r_2 = (01011)$, and $r_3 = (10101)$ can be corrected using the coset
method? Correct those that can be corrected. For the one(s) that 
cannot be corrected, explain why. 


8. Let W = {(00), (01), (10), (11)}, and choose the following generator 
matrix G. 


1 1 0 O 1 1 
an 1 1 1 


0 0 1 


(a) Construct the linear code C that results from G and W. How 
many errors are guaranteed to be correctable in this code? 


(b) Make a list of the coset leaders for C and their syndromes for
all coset leaders that contain a single one or all zeros. Why are
the remaining cosets irrelevant for error correction?

(c) Which of the vectors $r_1 = (100011)$, $r_2 = (001100)$, and
$r_3 = (111100)$ can be corrected in C using the coset method?
Correct those that can be corrected. For the one(s) that cannot
be corrected, explain why.
9. Prove Corollary 3.4.


10. Prove Theorem 3.5.


11. Show that the $[2^m - 1, 2^m - 1 - m]$ Hamming codes are perfect.


12. Prove Theorem 3.9.

13. An equivalence relation is a relation ~ on a set A that satisfies

(i) $a \sim a$

(ii) $a \sim b \Rightarrow b \sim a$

(iii) $a \sim b$ and $b \sim c \Rightarrow a \sim c$

for every $a, b, c \in A$. Let C be a linear code in $Z_2^n$. Define a relation
~ on V by x ~ y if x and y are in the same coset of C. Show that ~
is an equivalence relation on V.


14. A metric space is a set M with a real-valued function $d(\cdot,\cdot)$ on $M \times M$
that satisfies

(i) $d(x,y) \ge 0$, and $d(x,y) = 0$ if and only if $x = y$

(ii) $d(x,y) = d(y,x)$

(iii) $d(x,z) \le d(x,y) + d(y,z)$

for every $x, y, z \in M$. Prove or disprove: a code C in $Z_2^n$ with the
Hamming distance function $d(\cdot,\cdot)$ is a metric space.


Maple Exercises 


1. We mentioned in Section 3.3 that the (32,16) Reed-Muller code was
used in the Mariner 9 space probe when it returned photographs of
Mars to Earth in 1972.

(a) Construct the codewords in the (32,16) Reed-Muller code. How
many errors are guaranteed to be correctable in this code?

(b) Correct the following received vector r in the (32,16) Reed-Muller
code.

r = (11100101011010011110101101101001)


2. Let C be the [31,26] Hamming code.

(a) Construct the parity check matrix and a generator matrix for C.

(b) Construct the codeword in C that results from the following
vector w.

w = (10110101110110111110111000)

(c) Correct the following received vector r in C.

r = (1101011100110110110101011110111)
Chapter 4


BCH Codes 


The most useful codes we discussed in Chapter 3 were Hamming codes be- 
cause they are linear and perfect. However, Hamming codes are not ideal 
for situations in which the occurrence of more than one error in a single 
codeword is likely. Recall that Hamming codes are only one-error correct- 
ing. If more than one error occurs during the transmission of a Hamming 
codeword, the received vector will not be correctable to the codeword that 
was sent. Furthermore, since Hamming codes are perfect, if more than one 
error occurs during the transmission of a Hamming codeword, the received 
vector will be uniquely correctable — it will just be correctable to the wrong 
codeword. In this chapter we discuss a class of codes called BCH codes that 
are linear and can be constructed to be multiple-error correcting. BCH 
codes are named for their creators Bose, Chaudhuri, and Hocquenghem. 


4.1 Construction of BCH Codes 


One way that BCH codes differ from the codes we discussed in Chapter 3
is that BCH codewords are polynomials rather than vectors. To construct
a BCH code, we begin by letting $f(x) = x^m - 1 \in Z_2[x]$ for some positive
integer m. Then $R = Z_2[x]/(f(x))$ is a ring that can be represented by all
polynomials in $Z_2[x]$ of degree less than m. Suppose $g(x) \in Z_2[x]$ divides
f(x). Then C = {multiples of g(x) in $Z_2[x]$ of degree less than m} is a
vector space in R with dimension m - deg g(x). Hence, the polynomials in
C form codewords in an [m, m - deg g(x)] linear code in R with $2^{m - \deg g(x)}$
codewords. The polynomial g(x) is called a generator polynomial for the
code. We consider the codewords in this code to have length m positions
because we view each term in a polynomial codeword as a codeword position.
A codeword $c(x) \in Z_2[x]$ with m terms can naturally be expressed as
a unique vector in $Z_2^m$ by listing the coefficients of c(x) in order (including
coefficients of zero). In this book we will assume BCH codewords are
transmitted this way with increasing powers of x.


Example 4.1 Let $f(x) = x^7 - 1$ and $g(x) = x^3 + x + 1$ in $Z_2[x]$. Then
the code C of multiples of g(x) in $Z_2[x]$ of degree less than 7 has basis
$\{x^3 + x + 1,\ x^4 + x^2 + x,\ x^5 + x^3 + x^2,\ x^6 + x^4 + x^3\}$. Hence, C is a
[7,4] code with 16 codewords consisting of all linear combinations of
these basis polynomials in $Z_2[x]$. In this code, we will assume that
the codeword $x^5 + x^4 + x^3 + x$ would be transmitted as the vector
$0 + 1x + 0x^2 + 1x^3 + 1x^4 + 1x^5 + 0x^6 = (0101110) \in Z_2^7$. ■
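
As a cross-check of Example 4.1, the following Python sketch lists all multiples of g(x) of degree less than 7. It is an illustration only and is not taken from the text. It stores a polynomial over $Z_2$ as an integer bit mask (bit k is the coefficient of $x^k$), and the helper names `pmul` and `to_vector` are hypothetical.

```python
# Sketch: the [7,4] code of Example 4.1 as multiples of g(x) = x^3 + x + 1.
# A polynomial over Z_2 is stored as an int whose bit k is the coefficient of x^k.

def pmul(u, v):
    """Multiply two polynomials over Z_2 (carry-less multiplication)."""
    result = 0
    while v:
        if v & 1:
            result ^= u
        u <<= 1
        v >>= 1
    return result

def to_vector(c, length):
    """Coefficient vector of c in order of increasing powers of x."""
    return [(c >> k) & 1 for k in range(length)]

m = 7
g = 0b1011                      # x^3 + x + 1
k = m - (g.bit_length() - 1)    # dimension m - deg g(x) = 4

# Every codeword is g(x)h(x) with deg h(x) < k.
code = sorted(pmul(g, h) for h in range(2**k))

codeword = 0b111010             # x^5 + x^4 + x^3 + x
print(len(code), codeword in code, to_vector(codeword, m))
```

Listing the coefficients in order of increasing powers of x reproduces the transmitted vector (0101110) from the example.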


For a code constructed as described above to be a BCH code, the
generator polynomial g(x) must be chosen as follows. Let $a_1, a_2, \ldots, a_s$ for
s < m be roots of f(x) with minimum polynomials $m_1(x), m_2(x), \ldots, m_s(x)$
in $Z_2[x]$, respectively, and let g(x) be the least common multiple of the
polynomials $m_i(x)$ in $Z_2[x]$. Note that g(x) divides f(x), so g(x) can be
used as the generator polynomial for a code. Choosing g(x) in this manner
is useful because of how it allows errors to be corrected in the resulting code.
We will discuss BCH error correction in Section 4.2. Actually, choosing a
generator polynomial as just described still does not necessarily yield a
BCH code. For the resulting code to be a BCH code, the values of m and
the roots $a_i$ must be chosen in a special way. We describe this next.


Let $m = 2^n - 1$ for some positive integer n, and let $f(x) = x^m - 1$ in
$Z_2[x]$. Suppose p(x) is a primitive polynomial of degree n in $Z_2[x]$. Then
$Z_2[x]/(p(x))$ is a field of order $2^n$ whose nonzero elements are generated by
the field element x. For reasons that will become apparent when we begin
discussing Reed-Solomon codes in Chapter 5, we will denote the element x
in this field by a. Then, for the roots $a_i$ described in the previous paragraph,
we let $a_i = a^i$ for $i = 1, \ldots, s$. Choosing the $a_i$ in this manner is useful
because of how it allows the generator polynomial g(x) to be determined.
The polynomials $m_i(x)$ described in the previous paragraph are then the
minimum polynomials of $a^i$ for $i = 1, \ldots, s$. Thus, we can determine g(x)
by forming the product that includes a single factor of each unique $m_i(x)$.
As a consequence of Lagrange's Theorem (Theorem 1.4), $a^i$ will be a root
of f(x) for all i. Hence, g(x) will divide f(x).


Because BCH codewords are in $Z_2[x]$, some of the computations that
are necessary for constructing BCH codes can be done very easily. Specifically,
note that $(x_1 + x_2 + \cdots + x_r)^2 = x_1^2 + x_2^2 + \cdots + x_r^2$ over $Z_2$ since
all cross terms will contain a factor of 2. Therefore, for a polynomial
$h(x) = x^{i_1} + x^{i_2} + \cdots + x^{i_r} \in Z_2[x]$, it follows that

\[
h(a^2) = (a^2)^{i_1} + (a^2)^{i_2} + \cdots + (a^2)^{i_r} = (a^{i_1} + a^{i_2} + \cdots + a^{i_r})^2 = h(a)^2.
\]

Similarly, it can be seen that $h(a^{2k}) = h(a^k)^2$ for any positive integer k.
Thus, for example, $h(a^{12}) = h(a^6)^2 = h(a^3)^4$. The utility of this will be
clear in the following examples.
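
The squaring identity can be checked numerically. The Python sketch below is an illustration under stated assumptions: it works in the field of order 16 built from $p(x) = x^4 + x + 1$ (the field used in Example 4.3), stores field elements as 4-bit masks, and uses hypothetical helper names `gf_mul` and `evaluate`.

```python
# Sketch: check h(a^(2k)) = h(a^k)^2 for polynomials h(x) over Z_2.
# Assumption: the field is Z_2[x]/(x^4 + x + 1); an element is a 4-bit mask
# whose bit j is the coefficient of a^j.
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    if nxt & 0b10000:
        nxt ^= 0b10011           # reduce using a^4 = a + 1
    EXP.append(nxt)              # EXP[i] is a^i for i = 0..14
LOG = {value: i for i, value in enumerate(EXP)}

def gf_mul(u, v):
    """Multiply two field elements using exponents of a."""
    return 0 if 0 in (u, v) else EXP[(LOG[u] + LOG[v]) % 15]

def evaluate(poly, i):
    """Value at a^i of a Z_2 polynomial given as a bit mask (bit k <-> x^k)."""
    value = 0
    for k in range(poly.bit_length()):
        if (poly >> k) & 1:
            value ^= EXP[(i * k) % 15]
    return value

# Test every polynomial h(x) of degree less than 10 and every exponent k.
identity_holds = all(
    evaluate(h, 2 * k) == gf_mul(evaluate(h, k), evaluate(h, k))
    for h in range(1, 2**10)
    for k in range(15)
)
print(identity_holds)
```

The check relies only on the coefficients of h(x) being 0 or 1, which is why the same shortcut applies to any polynomial in $Z_2[x]$.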


Example 4.2 Let $f(x) = x^7 - 1$, and choose the primitive polynomial
$p(x) = x^3 + x + 1$ in $Z_2[x]$. Then for the element a = x in the field
$Z_2[x]/(p(x))$ of order 8, we list the field elements that correspond to the
first seven powers of a in the following table.


Power     Field Element

$a^1$     $a$
$a^2$     $a^2$
$a^3$     $a + 1$
$a^4$     $a^2 + a$
$a^5$     $a^2 + a + 1$
$a^6$     $a^2 + 1$
$a^7$     $1$


Let C be the BCH code that results from considering the first four powers
of a. To determine the generator polynomial g(x) for C, we must find
the minimum polynomials $m_1(x)$, $m_2(x)$, $m_3(x)$, and $m_4(x)$. But
since p(x) is primitive and a = x, it follows that p(a) = 0. Furthermore,
$p(a^2) = p(a)^2 = 0$ and $p(a^4) = p(a)^4 = 0$ since $p(x) \in Z_2[x]$. Thus,
$m_1(x) = m_2(x) = m_4(x) = p(x)$. Now, since $a^3$ is a root of f(x), the
minimum polynomial $m_3(x)$ of $a^3$ must be one of the irreducible factors
of $x^7 - 1 = (x^3 + x + 1)(x^3 + x^2 + 1)(x + 1)$. (This factorization can be
obtained by using the Maple Factor command as illustrated in Section
4.3.1.) By substituting $a^3$ into each of these irreducible factors, we can
find that $x^3 + x^2 + 1$ is equal to zero when evaluated at $a^3$. Hence,
$m_3(x) = x^3 + x^2 + 1$. Thus, $g(x) = m_1(x)m_3(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$.
The code that results from this generator polynomial is a [7,1] BCH code
with basis {g(x)} and two codewords. ■


Example 4.3 Let $f(x) = x^{15} - 1$, and choose the primitive polynomial
$p(x) = x^4 + x + 1$ in $Z_2[x]$. Then for the element a = x in the field
$Z_2[x]/(p(x))$ of order 16, we list the field elements that correspond to the
first 15 powers of a in the following table.


Power     Field Element          Power      Field Element

$a^1$     $a$                    $a^9$      $a^3 + a$
$a^2$     $a^2$                  $a^{10}$   $a^2 + a + 1$
$a^3$     $a^3$                  $a^{11}$   $a^3 + a^2 + a$
$a^4$     $a + 1$                $a^{12}$   $a^3 + a^2 + a + 1$
$a^5$     $a^2 + a$              $a^{13}$   $a^3 + a^2 + 1$
$a^6$     $a^3 + a^2$            $a^{14}$   $a^3 + 1$
$a^7$     $a^3 + a + 1$          $a^{15}$   $1$
$a^8$     $a^2 + 1$


Let C be the BCH code that results from considering the first six powers
of a. To determine the generator polynomial g(x) for C, we must find the
minimum polynomials $m_1(x), m_2(x), \ldots, m_6(x)$. But p(a) = 0, and hence
$p(a^2) = p(a^4) = 0$. Thus, $m_1(x) = m_2(x) = m_4(x) = p(x)$. Also, since $a^3$
and $a^5$ are roots of f(x), then $m_3(x)$ and $m_5(x)$ are irreducible factors of
$x^{15} - 1 = (x + 1)(x^2 + x + 1)(x^4 + x + 1)(x^4 + x^3 + 1)(x^4 + x^3 + x^2 + x + 1)$.
By substituting $a^3$ and $a^5$ into each of these irreducible factors, we can
find that $m_3(x) = x^4 + x^3 + x^2 + x + 1$ and $m_5(x) = x^2 + x + 1$. Furthermore,
$m_3(a^6) = m_3(a^3)^2 = 0$, and hence $m_6(x) = m_3(x)$. Thus,
$g(x) = m_1(x)m_3(x)m_5(x) = x^{10} + x^8 + x^5 + x^4 + x^2 + x + 1$. The code
that results from this generator polynomial is a [15,5] BCH code with basis
$\{g(x), xg(x), x^2g(x), x^3g(x), x^4g(x)\}$ and $2^5 = 32$ codewords. ■
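
The search for minimum polynomials in Example 4.3 can be sketched in Python as follows. This is only an illustration of the substitution approach described in the example, not the book's Maple procedure. The list `FACTORS` is copied from the factorization of $x^{15} - 1$ quoted above, polynomials over $Z_2$ are stored as bit masks (bit k is the coefficient of $x^k$), and all function names are hypothetical.

```python
# Sketch: build the generator polynomial of Example 4.3 by substitution.
# Field: Z_2[x]/(x^4 + x + 1); elements are 4-bit masks (bit j <-> a^j).
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    if nxt & 0b10000:
        nxt ^= 0b10011           # reduce using a^4 = a + 1
    EXP.append(nxt)              # EXP[i] is a^i for i = 0..14

def evaluate(poly, i):
    """Value at a^i of a Z_2 polynomial given as a bit mask (bit k <-> x^k)."""
    value = 0
    for k in range(poly.bit_length()):
        if (poly >> k) & 1:
            value ^= EXP[(i * k) % 15]
    return value

def pmul(u, v):
    """Multiply two polynomials over Z_2."""
    result = 0
    while v:
        if v & 1:
            result ^= u
        u <<= 1
        v >>= 1
    return result

# Irreducible factors of x^15 - 1 over Z_2, as quoted in Example 4.3:
# x+1, x^2+x+1, x^4+x+1, x^4+x^3+1, x^4+x^3+x^2+x+1.
FACTORS = [0b11, 0b111, 0b10011, 0b11001, 0b11111]

def minimum_polynomial(i):
    """The irreducible factor of x^15 - 1 that has a^i as a root."""
    return next(f for f in FACTORS if evaluate(f, i) == 0)

def generator(t):
    """Product of the distinct minimum polynomials of a, a^2, ..., a^(2t)."""
    g, used = 1, []
    for i in range(1, 2 * t + 1):
        m = minimum_polynomial(i)
        if m not in used:
            used.append(m)
            g = pmul(g, m)
    return g

g = generator(3)
print(bin(g))   # bits 10, 8, 5, 4, 2, 1, 0 are set
```

With t = 3 the three distinct factors found are $m_1(x)$, $m_3(x)$, and $m_5(x)$, and their product has degree 10, in agreement with the [15,5] code of the example.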


Although BCH codes are not as easy to construct as Hadamard or 
Reed-Muller codes, BCH codes are linear while Hadamard and Reed-Muller 
codes are not. Also, unlike Hamming codes, BCH codes can be constructed 
to be multiple-error correcting. Specifically, in the next section we will 
show that a BCH code that results from considering the first 2t powers of a 
is t-error correcting. For example, since in Example 4.3 we considered the 
first six powers of a, the resulting BCH code is 3-error correcting. Also, 
since in Example 4.2 we considered the first four powers of a, the resulting 
BCH code is 2-error correcting. We discuss a scheme for correcting errors 
in BCH codewords next. 


4.2 Error Correction in BCH Codes 


As mentioned in Section 4.1, the generator polynomial for a BCH code is 
chosen in a special way because of how it allows errors to be corrected in 
the code. Before discussing the BCH error correction scheme, we first note 
the following theorem. 
Theorem 4.1 Let C be a BCH code that results from a primitive polynomial
of degree n by considering the first s powers of a, and suppose
$c(x) \in Z_2[x]$ has degree less than $2^n - 1$. Then $c(x) \in C$ if and only if
$c(a^i) = 0$ for $i = 1, \ldots, s$.


Proof. Let $m_i(x)$ be the minimum polynomial of $a^i$ in $Z_2[x]$ for $i = 1, \ldots, s$,
and let g(x) be the least common multiple of the polynomials $m_i(x)$ in
$Z_2[x]$. If $c(x) \in C$, then c(x) = g(x)h(x) for some $h(x) \in Z_2[x]$. Thus,
$c(a^i) = g(a^i)h(a^i) = 0 \cdot h(a^i) = 0$ for $i = 1, \ldots, s$. Conversely, if $c(a^i) = 0$
for $i = 1, \ldots, s$, then $m_i(x)$ divides c(x) for $i = 1, \ldots, s$. Hence, g(x) divides
c(x), and $c(x) \in C$. ■


We now outline the BCH error correction scheme. Let C be a BCH
code that results from a primitive polynomial of degree n by considering
the first 2t powers of a. We will show in Theorem 4.2 that C is then t-error
correcting. Suppose $c(x) \in C$ is transmitted and we receive the polynomial
$r(x) \ne c(x)$ in $Z_2[x]$ of degree less than $2^n - 1$. Then r(x) = c(x) + e(x)
for some nonzero error polynomial e(x) in $Z_2[x]$ of degree less than $2^n - 1$.
To correct r(x), we must only determine e(x), for we could then compute
c(x) = r(x) + e(x). But note that Theorem 4.1 implies $r(a^i) = e(a^i)$ for
$i = 1, \ldots, 2t$. Thus, by knowing r(x), we also know some information about
e(x). We will call the values of $r(a^i)$ the syndromes of r(x). Now, suppose

\[
e(x) = x^{m_1} + x^{m_2} + \cdots + x^{m_p}
\]

for some integer error positions $m_1 < m_2 < \cdots < m_p$ with $p \le t$ and
$m_p < 2^n - 1$. To find these error positions, we begin by computing the
first 2t syndromes of r(x). We will denote these syndromes as follows by
$r_1, r_2, \ldots, r_{2t}$.

\[
\begin{aligned}
r_1 &= r(a) = e(a) = a^{m_1} + a^{m_2} + \cdots + a^{m_p} \\
r_2 &= r(a^2) = e(a^2) = (a^2)^{m_1} + (a^2)^{m_2} + \cdots + (a^2)^{m_p} \\
&\ \,\vdots \\
r_{2t} &= r(a^{2t}) = e(a^{2t}) = (a^{2t})^{m_1} + (a^{2t})^{m_2} + \cdots + (a^{2t})^{m_p}
\end{aligned}
\]


Next, we introduce the following polynomial E(z), which we will call an
error locator polynomial.

\[
E(z) = (z - a^{m_1})(z - a^{m_2}) \cdots (z - a^{m_p}) = z^p + \sigma_1 z^{p-1} + \cdots + \sigma_p
\]

We call E(z) an error locator polynomial because the roots of E(z) show
the error positions in r(x). Our eventual goal will be to determine these
roots. Before doing this, we must first find the coefficients $\sigma_1, \sigma_2, \ldots, \sigma_p$
of E(z). These coefficients are the elementary symmetric functions in
$a^{m_1}, a^{m_2}, \ldots, a^{m_p}$. That is,

\[
\begin{aligned}
\sigma_1 &= \sum_{1 \le i \le p} a^{m_i} \\
\sigma_2 &= \sum_{1 \le i < j \le p} a^{m_i} a^{m_j} \\
&\ \,\vdots \\
\sigma_p &= a^{m_1} a^{m_2} \cdots a^{m_p}
\end{aligned}
\]


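Note that if we evaluate $E(a^{m_j})$ for all $1 \le j \le p$ and multiply each result
by $(a^{m_j})^i$ for any $1 \le i \le p$, since $E(a^{m_j}) = 0$ for all $1 \le j \le p$, we obtain
the following system of equations for $1 \le i \le p$.

\[
\begin{aligned}
0 &= (a^{m_1})^i \left[ (a^{m_1})^p + \sigma_1 (a^{m_1})^{p-1} + \cdots + \sigma_p \right] \\
0 &= (a^{m_2})^i \left[ (a^{m_2})^p + \sigma_1 (a^{m_2})^{p-1} + \cdots + \sigma_p \right] \\
&\ \,\vdots \\
0 &= (a^{m_p})^i \left[ (a^{m_p})^p + \sigma_1 (a^{m_p})^{p-1} + \cdots + \sigma_p \right]
\end{aligned}
\]

By distributing the $(a^{m_j})^i$ in the preceding equations and summing the
results, we obtain the following equation for $1 \le i \le p$.

\[
0 = r_{i+p} + \sigma_1 r_{i+p-1} + \sigma_2 r_{i+p-2} + \cdots + \sigma_p r_i
\]

Since this holds for $1 \le i \le p$, this yields a system of p linear equations in
the p unknowns $\sigma_1, \ldots, \sigma_p$ that are equivalent to the following single matrix
equation.

\[
\begin{bmatrix} r_1 & \cdots & r_p \\ \vdots & & \vdots \\ r_p & \cdots & r_{2p-1} \end{bmatrix}
\begin{bmatrix} \sigma_p \\ \vdots \\ \sigma_1 \end{bmatrix}
=
\begin{bmatrix} r_{p+1} \\ \vdots \\ r_{2p} \end{bmatrix}
\tag{4.1}
\]

If the p x p coefficient matrix in (4.1) is nonsingular, then we can solve
(4.1) uniquely for $\sigma_1, \ldots, \sigma_p$. After we find $\sigma_1, \ldots, \sigma_p$, we can then form
the error locator polynomial E(z) and determine $a^{m_1}, \ldots, a^{m_p}$ by trial and
error as the roots of E(z). This reveals the error positions $m_1, \ldots, m_p$ in
r(x).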
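
The relation between the syndromes and the coefficients of E(z) can be checked on a small case. The Python sketch below is an illustration under stated assumptions: it uses the field of order 16 from Example 4.3, picks the error positions 0, 5, and 12 purely as sample data, and uses hypothetical helper names. It expands E(z) from its roots and confirms that $0 = r_{i+p} + \sigma_1 r_{i+p-1} + \cdots + \sigma_p r_i$ for $1 \le i \le p$.

```python
# Sketch: verify the syndrome recurrence behind the matrix equation (4.1).
# Field: Z_2[x]/(x^4 + x + 1); elements are 4-bit masks (bit j <-> a^j).
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    if nxt & 0b10000:
        nxt ^= 0b10011           # reduce using a^4 = a + 1
    EXP.append(nxt)              # EXP[i] is a^i for i = 0..14
LOG = {value: i for i, value in enumerate(EXP)}

def gf_mul(u, v):
    return 0 if 0 in (u, v) else EXP[(LOG[u] + LOG[v]) % 15]

def locator_coefficients(positions):
    """[1, sigma_1, ..., sigma_p] for E(z) = product of (z - a^m) over positions."""
    coeffs = [1]
    for m in positions:
        root = EXP[m % 15]
        shifted = coeffs + [0]                  # coeffs times z
        for k, c in enumerate(coeffs):          # plus root times coeffs
            shifted[k + 1] ^= gf_mul(c, root)   # minus equals plus over Z_2
        coeffs = shifted
    return coeffs

def error_syndromes(positions, count):
    """r_1, ..., r_count for e(x) equal to the sum of x^m over the positions."""
    syndromes = []
    for i in range(1, count + 1):
        value = 0
        for m in positions:
            value ^= EXP[(i * m) % 15]
        syndromes.append(value)
    return syndromes

positions = [0, 5, 12]           # sample error positions (an assumption)
p = len(positions)
sigma = locator_coefficients(positions)
syn = error_syndromes(positions, 2 * p)   # syn[k - 1] holds r_k

recurrence_holds = True
for i in range(1, p + 1):
    total = syn[i + p - 1]
    for k in range(1, p + 1):
        total ^= gf_mul(sigma[k], syn[i + p - k - 1])
    recurrence_holds = recurrence_holds and total == 0
print(sigma, syn, recurrence_holds)
```

In practice the computation runs the other way: the syndromes are known, and the recurrence is solved for the coefficients of E(z).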


We will now look at two examples of the BCH error correction scheme
in the code that results from the generator polynomial in Example 4.3.
Since we will generally not know the number of errors in a received polynomial
before attempting to correct it, we will begin the BCH error correction
scheme in a t-error correcting BCH code by assuming that the received polynomial
r(x) contains the maximum number t of correctable errors and using
the first 2t syndromes of r(x). If r(x) does not contain exactly t errors, then
the t x t coefficient matrix in (4.1) will be singular. In this case,
we can simply reduce the number of assumed errors to t - 1 and repeat the
error correction procedure using only the first 2t - 2 syndromes of r(x). If
r(x) also does not contain exactly t - 1 errors (i.e., if the (t - 1) x (t - 1)
coefficient matrix in (4.1) is also singular), we can continue to repeat
the procedure, each time reducing the number of assumed errors by
one and using twice as many syndromes as the number of assumed errors,
until the coefficient matrix in (4.1) is nonsingular. If the error in r(x) is
not correctable, then the coefficient matrix in (4.1) will be singular
for every number of assumed errors between 1 and t.


Example 4.4 Let C be the BCH code in Example 4.3 that results from
the generator polynomial $g(x) = x^{10} + x^8 + x^5 + x^4 + x^2 + x + 1$. Suppose a
codeword in C is transmitted as a vector in $Z_2^{15}$ and we receive the vector
$r = (101111110010000) \in Z_2^{15}$. Note first that this vector converts to the
polynomial $r(x) = 1 + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^{10} \in Z_2[x]$. It can
easily be verified that g(x) does not divide r(x). Hence, $r(x) \notin C$. Since
C is 3-error correcting, to correct r(x) we begin by computing the first six
syndromes of r(x). Using the table of powers of a and corresponding field
elements in Example 4.3, we can compute these syndromes as follows.

\[
\begin{aligned}
r_1 = r(a) &= 1 + a^2 + a^3 + a^4 + a^5 + a^6 + a^7 + a^{10} \\
&= 1 + a^2 + a^3 + (a + 1) + (a^2 + a) + (a^3 + a^2) + (a^3 + a + 1) + (a^2 + a + 1) \\
&= a^3 \\
r_3 = r(a^3) &= 1 + a^6 + a^9 + a^{12} + a^{15} + a^{18} + a^{21} + a^{30} \\
&= 1 + a^6 + a^9 + a^{12} + 1 + a^3 + a^6 + 1 \\
&= a^6 \\
r_5 = r(a^5) &= 1 + a^{10} + a^{15} + a^{20} + a^{25} + a^{30} + a^{35} + a^{50} \\
&= 1 + a^{10} + 1 + a^5 + a^{10} + 1 + a^5 + a^5 \\
&= a^{10}
\end{aligned}
\]

Since $r(x) \in Z_2[x]$, we can find the remaining syndromes as follows.

\[
\begin{aligned}
r_2 &= r(a^2) = (r(a))^2 = (a^3)^2 = a^6 \\
r_4 &= r(a^4) = (r(a^2))^2 = (a^6)^2 = a^{12} \\
r_6 &= r(a^6) = (r(a^3))^2 = (a^6)^2 = a^{12}
\end{aligned}
\]
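Now, assuming r(x) contains three errors, we must find $\sigma_1$, $\sigma_2$, and $\sigma_3$ that
satisfy the following equation.

\[
\begin{bmatrix} a^3 & a^6 & a^6 \\ a^6 & a^6 & a^{12} \\ a^6 & a^{12} & a^{10} \end{bmatrix}
\begin{bmatrix} \sigma_3 \\ \sigma_2 \\ \sigma_1 \end{bmatrix}
=
\begin{bmatrix} a^{12} \\ a^{10} \\ a^{12} \end{bmatrix}
\tag{4.2}
\]

It can be verified that the determinant of the 3 x 3 coefficient matrix in
(4.2) is $a^{12}$. Hence, this coefficient matrix is nonsingular and r(x) contains
three errors. We can use Cramer's rule to determine $\sigma_1$, $\sigma_2$, and $\sigma_3$. For
example, since

\[
\begin{vmatrix} a^{12} & a^6 & a^6 \\ a^{10} & a^6 & a^{12} \\ a^{12} & a^{12} & a^{10} \end{vmatrix}
= a^{28} + a^{36} + a^{26} + a^{30} + a^{28} + a^{24}
= a^6 + a^{11} + 1 + a^9
= a^3 + 1
= a^{14},
\]

Cramer's rule yields

\[
\sigma_3 = \frac{a^{14}}{a^{12}} = a^2.
\]

Similarly, since

\[
\begin{vmatrix} a^3 & a^{12} & a^6 \\ a^6 & a^{10} & a^{12} \\ a^6 & a^{12} & a^{10} \end{vmatrix} = \cdots = a^{10}
\quad \text{and} \quad
\begin{vmatrix} a^3 & a^6 & a^{12} \\ a^6 & a^6 & a^{10} \\ a^6 & a^{12} & a^{12} \end{vmatrix} = \cdots = 1,
\]

Cramer's rule yields

\[
\sigma_2 = \frac{a^{10}}{a^{12}} = a^{13} \quad \text{and} \quad \sigma_1 = \frac{1}{a^{12}} = a^3.
\]

The resulting error locator polynomial is $E(z) = z^3 + a^3 z^2 + a^{13} z + a^2$.
By evaluating E(z) at successive powers of a, we can find that the roots of
E(z) are 1, $a^5$, and $a^{12}$. Hence, the error in r(x) is $e(x) = 1 + x^5 + x^{12}$.
Thus, we correct r(x) to the following codeword c(x).

\[
c(x) = r(x) + e(x) = x^2 + x^3 + x^4 + x^6 + x^7 + x^{10} + x^{12}
\]

It can easily be verified that this polynomial c(x) is a multiple of g(x).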


Suppose another codeword in C is transmitted and we receive the vector
$r = (100100010011010) \in Z_2^{15}$. This vector converts to the polynomial
$r(x) = 1 + x^3 + x^7 + x^{10} + x^{11} + x^{13} \in Z_2[x]$. It can easily be verified
that r(x) is not a multiple of g(x). Hence, $r(x) \notin C$. To correct r(x) we
begin by computing the first six syndromes of r(x). These syndromes can
be determined as in the first part of this example and are as follows.

\[
\begin{aligned}
r_1 &= a^5 \\
r_2 &= a^{10} \\
r_3 &= a^2 \\
r_4 &= a^5 \\
r_5 &= 1 \\
r_6 &= a^4
\end{aligned}
\]


Now, assuming r(x) contains three errors, we must find $\sigma_1$, $\sigma_2$, and $\sigma_3$ that
satisfy the following equation.

\[
\begin{bmatrix} a^5 & a^{10} & a^2 \\ a^{10} & a^2 & a^5 \\ a^2 & a^5 & 1 \end{bmatrix}
\begin{bmatrix} \sigma_3 \\ \sigma_2 \\ \sigma_1 \end{bmatrix}
=
\begin{bmatrix} a^5 \\ 1 \\ a^4 \end{bmatrix}
\tag{4.3}
\]

However, it can be verified that the determinant of the 3 x 3 coefficient
matrix in (4.3) is 0. Hence, this coefficient matrix is singular and
r(x) does not contain three errors. Thus, we assume r(x) contains only
two errors and use only the first four syndromes of r(x). Assuming r(x)
contains only two errors, we must find $\sigma_1$ and $\sigma_2$ that satisfy the following
equation.

\[
\begin{bmatrix} a^5 & a^{10} \\ a^{10} & a^2 \end{bmatrix}
\begin{bmatrix} \sigma_2 \\ \sigma_1 \end{bmatrix}
=
\begin{bmatrix} a^2 \\ a^5 \end{bmatrix}
\tag{4.4}
\]

It can be verified that the determinant of the 2 x 2 coefficient matrix in (4.4)
is $a^{13}$, so this coefficient matrix is nonsingular and r(x) contains two errors.
We can again use Cramer's rule to determine $\sigma_1$ and $\sigma_2$. Specifically, since

\[
\begin{vmatrix} a^2 & a^{10} \\ a^5 & a^2 \end{vmatrix} = a^4 + 1 = a
\quad \text{and} \quad
\begin{vmatrix} a^5 & a^2 \\ a^{10} & a^5 \end{vmatrix} = a^{10} + a^{12} = a^3,
\]

Cramer's rule yields

\[
\sigma_2 = \frac{a}{a^{13}} = a^3 \quad \text{and} \quad \sigma_1 = \frac{a^3}{a^{13}} = a^5.
\]

The resulting error locator polynomial is $E(z) = z^2 + a^5 z + a^3$. The roots
of E(z) can be determined as a and $a^2$. Hence, the error in r(x) is
$e(x) = x + x^2$ and we correct r(x) to the following codeword c(x).

\[
c(x) = r(x) + e(x) = 1 + x + x^2 + x^3 + x^7 + x^{10} + x^{11} + x^{13}
\]

It can easily be verified that this polynomial c(x) is a multiple of g(x). ■
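
Both parts of Example 4.4 can be reproduced with a short program. The following Python sketch is an illustration only and differs from the example in one respect that should be noted plainly: it solves (4.1) by Gauss-Jordan elimination over the field of order 16 instead of Cramer's rule. Polynomials over $Z_2$ are stored as bit masks (bit k is the coefficient of $x^k$), and all function names are hypothetical.

```python
# Sketch: BCH decoding in the [15,5] code, lowering the assumed number of
# errors until the coefficient matrix in (4.1) is nonsingular.
# Field: Z_2[x]/(x^4 + x + 1); elements are 4-bit masks (bit j <-> a^j).
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    if nxt & 0b10000:
        nxt ^= 0b10011
    EXP.append(nxt)              # EXP[i] is a^i for i = 0..14
LOG = {value: i for i, value in enumerate(EXP)}

def gf_mul(u, v):
    return 0 if 0 in (u, v) else EXP[(LOG[u] + LOG[v]) % 15]

def gf_inv(u):
    return EXP[(15 - LOG[u]) % 15]

def evaluate(poly, i):
    """Value at a^i of a Z_2 polynomial given as a bit mask."""
    value = 0
    for k in range(poly.bit_length()):
        if (poly >> k) & 1:
            value ^= EXP[(i * k) % 15]
    return value

def solve(matrix, rhs):
    """Solve matrix * x = rhs over the field; return None if singular."""
    n = len(rhs)
    rows = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, w) for v, w in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def decode(received, t=3):
    """Return the corrected codeword as a bit mask, or None if not correctable."""
    syn = [evaluate(received, i) for i in range(1, 2 * t + 1)]  # syn[k-1] = r_k
    if not any(syn):
        return received
    for p in range(t, 0, -1):
        matrix = [[syn[i + j] for j in range(p)] for i in range(p)]
        rhs = [syn[p + i] for i in range(p)]
        solution = solve(matrix, rhs)            # [sigma_p, ..., sigma_1]
        if solution is None:
            continue                             # singular: assume fewer errors
        sigma = [1] + solution[::-1]             # [1, sigma_1, ..., sigma_p]
        positions = []
        for m in range(15):                      # trial and error over a^m
            value = 0
            for coeff in sigma:                  # Horner evaluation of E(a^m)
                value = gf_mul(value, EXP[m]) ^ coeff
            if value == 0:
                positions.append(m)
        if len(positions) != p:
            return None
        for m in positions:
            received ^= 1 << m
        return received
    return None

def from_vector(bits):
    return sum(int(b) << k for k, b in enumerate(bits))

def to_vector(poly, length=15):
    return "".join(str((poly >> k) & 1) for k in range(length))

first = decode(from_vector("101111110010000"))
second = decode(from_vector("100100010011010"))
print(to_vector(first), to_vector(second))
```

For the second received vector the 3 x 3 system is singular, so the loop falls through to p = 2, which mirrors the step from (4.3) to (4.4).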
We close this section by proving a fundamental result we have mentioned
regarding BCH codes.


Theorem 4.2 Let C be a BCH code that results from considering the first
2t powers of a. Then C is t-error correcting.


Proof. Suppose C results from a primitive polynomial of degree n, and let
$m = 2^n - 1$. Consider the following matrix H.

\[
H = \begin{bmatrix}
1 & a & a^2 & \cdots & a^{m-1} \\
1 & a^2 & (a^2)^2 & \cdots & (a^2)^{m-1} \\
1 & a^3 & (a^3)^2 & \cdots & (a^3)^{m-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & a^{2t} & (a^{2t})^2 & \cdots & (a^{2t})^{m-1}
\end{bmatrix}
\]

Note that for a polynomial $r(x) = b_0 + b_1 x + \cdots + b_{m-1} x^{m-1} \in Z_2[x]$, if
we let $b = (b_0, b_1, \ldots, b_{m-1})$, then $Hb^T = (r_1, r_2, \ldots, r_{2t})^T$. Hence, $r(x) \in C$
if and only if $Hb^T$ is the zero vector. Thus, H can serve as a parity check
matrix for C.


We will now show that the minimum number of linearly dependent
columns in H is 2t + 1. We first show that any 2t columns in H must be
linearly independent. Choose integers $0 \le j_1 < j_2 < \cdots < j_{2t} < m$. Then
the columns in H in these positions form the following 2t x 2t matrix.

\[
\begin{bmatrix}
a^{j_1} & a^{j_2} & \cdots & a^{j_{2t}} \\
(a^2)^{j_1} & (a^2)^{j_2} & \cdots & (a^2)^{j_{2t}} \\
\vdots & \vdots & & \vdots \\
(a^{2t})^{j_1} & (a^{2t})^{j_2} & \cdots & (a^{2t})^{j_{2t}}
\end{bmatrix}
\]

The determinant of this matrix can be expressed as

\[
a^{j_1} a^{j_2} \cdots a^{j_{2t}} \cdot
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
a^{j_1} & a^{j_2} & \cdots & a^{j_{2t}} \\
\vdots & \vdots & & \vdots \\
(a^{j_1})^{2t-1} & (a^{j_2})^{2t-1} & \cdots & (a^{j_{2t}})^{2t-1}
\end{vmatrix},
\]

which is nonzero because it is the determinant of a Vandermonde matrix
with distinct columns. Thus, any 2t columns in H are linearly independent.
Also, since H has 2t rows, then we know that any 2t + 1 columns in H
must be linearly dependent. Therefore, the minimum number of linearly
dependent columns in H is 2t + 1. By Theorem 3.7, we know then that the
minimum distance of C is 2t + 1. Hence, C is t-error correcting. ■
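
For the [15,5] code of Example 4.3, where t = 3, the conclusion of Theorem 4.2 can be confirmed by brute force. The Python sketch below is an illustration only. It assumes the generator polynomial from Example 4.3, stores polynomials over $Z_2$ as bit masks (bit k is the coefficient of $x^k$), and uses the fact that the minimum distance of a linear code equals the smallest weight of a nonzero codeword.

```python
# Sketch: minimum distance of the [15,5] BCH code by listing all 32 codewords.

def pmul(u, v):
    """Multiply two polynomials over Z_2."""
    result = 0
    while v:
        if v & 1:
            result ^= u
        u <<= 1
        v >>= 1
    return result

g = 0b10100110111                       # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
codewords = [pmul(g, h) for h in range(2**5)]

# Weight of a codeword = number of nonzero coefficients.
weights = [bin(c).count("1") for c in codewords if c != 0]
min_distance = min(weights)
print(len(codewords), min_distance, (min_distance - 1) // 2)
```

The last printed value is the number of errors guaranteed to be correctable, obtained from the minimum distance in the usual way.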
4.3 BCH Codes with Maple


In this section, we show how Maple can be used to construct the BCH gen- 
erator polynomial in Example 4.3, and correct the two received polynomials 
in Example 4.4. 


We begin by including the Maple linalg package and entering the
primitive polynomial $p(x) = x^4 + x + 1 \in Z_2[x]$ used to construct the code.

> with(linalg):
> p := x -> x^4 + x + 1:
> Primitive(p(x)) mod 2;

true


Next, we use the Maple degree function to assign the number of elements in 
the underlying field as the variable fs, and use the Maple vector function 
to create a vector in which to store the field elements. 

> fs := 2^(degree(p(x)));


fs := 16 
> field := vector(fs); 


field := array(1..16,[ ]) 


By entering the following commands, we generate and store the field ele- 
ments in the vector field. Since for BCH codes we denote the field element 
x by a, we use the parameters a and p(a) in the following Powmod com- 
mand. 


> for i from 1 to fs-1 do
>   field[i] := Powmod(a, i, p(a), a) mod 2:
> od:
> field[fs] := 0:
> evalm(field);

[a, a^2, a^3, a + 1, a^2 + a, a^3 + a^2, a^3 + a + 1, a^2 + 1, a^3 + a,
a^2 + a + 1, a^3 + a^2 + a, a^3 + a^2 + a + 1, a^3 + a^2 + 1, a^3 + 1,
1, 0]


Because working with BCH codes requires frequent conversions between 
polynomial field elements and powers of a, it will be very useful for us to 
establish an association between the polynomial field elements and corre- 
sponding powers of a for this field. We will establish this association in a 
table. We first use the Maple table command to create a table. 

> ftable := table(): 


Then, by entering the following commands we establish an association between
the polynomial field elements and corresponding powers of a for this
field in ftable. Note the bracket [ ] syntax for accessing the field elements
and the table entries.

> for i from 1 to fs-1 do 

> ftable[ field[i] ] := a^i: 

> od: 

> ftable[ field[fs] ] := 0: 
We can view the entries in ftable by entering the following print com- 
mand. For the sake of space (because Maple displays this table as a vertical 
list) we have removed this output. 

> print (ftable); 


The following command illustrates how ftable can be used. Specifically,
by entering the following command we can access the power of a that corresponds
to the polynomial field element $a^3 + a^2$.

> ftable[a^3 + a^2];

a^6
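
The same lookup idea can be sketched in Python with a dictionary. This is an illustration only and not a translation of the Maple commands. It assumes field elements are stored as 4-bit masks (bit j is the coefficient of $a^j$) and, like ftable, records the exponents 1 through 15, so the element 1 is stored as $a^{15}$; the zero element is left out of this sketch. The name `as_polynomial` is hypothetical.

```python
# Sketch: a dictionary from polynomial field elements to powers of a.
# Field: Z_2[x]/(x^4 + x + 1); elements are 4-bit masks (bit j <-> a^j).
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    if nxt & 0b10000:
        nxt ^= 0b10011           # reduce using a^4 = a + 1
    EXP.append(nxt)              # EXP[i] is a^i for i = 0..14

def as_polynomial(element):
    """Readable form of a field element, highest power first."""
    names = {0: "1", 1: "a", 2: "a^2", 3: "a^3"}
    terms = [names[k] for k in (3, 2, 1, 0) if (element >> k) & 1]
    return " + ".join(terms) if terms else "0"

# field[i - 1] is a^i for i = 1..15, in the same order as the Maple vector.
field = [EXP[i % 15] for i in range(1, 16)]
ftable = {element: i for i, element in enumerate(field, start=1)}

element = 0b1100                 # a^3 + a^2
print(as_polynomial(element), "-> a^%d" % ftable[element])
```

A dictionary gives constant-time conversion from a polynomial field element to its exponent, which is the role ftable plays in the syndrome and determinant computations.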


4.3.1 Construction of the Generator Polynomial 


We now show how Maple can be used to construct the generator polynomial
in Example 4.3. We first define the polynomial $f(x) = x^{15} - 1$ of which
each power of a is a root.

> f := x -> x^(fs-1) - 1;

f := x -> x^15 - 1

Next, we use the Maple Factor command to find the irreducible factors of
f(x) in $Z_2[x]$.

> factf := Factor(f(x)) mod 2;

factf := (x^4 + x^3 + x^2 + x + 1) (x^4 + x + 1) (x^2 + x + 1)
         (x + 1) (x^4 + x^3 + 1)

To construct the generator polynomial in Example 4.3, we will need to
access the factors of f(x) separately. We can do this by using the Maple
op command. For example, we can use the following command to assign
the third factor in the expression factf to the variable f3.

> f3 := op(3, factf);

f3 := x^2 + x + 1

We can then use the Maple unapply command as follows to convert f3
into a function that can be evaluated in the usual manner.

> f3 := unapply(f3, x);

f3 := x -> x^2 + x + 1

> f3(a^6);

a^12 + a^6 + 1


The following Maple Rem command returns the polynomial field element 
that corresponds to the preceding output. 
> Rem(f3(a^6), p(a), a) mod 2; 


a 


And the following command returns the number of factors in factf. 
> nops(factf); 


5 


We now find the minimum polynomials that will be the factors in the generator
polynomial. We first assign the number t = 3 of errors the code
is to be able to correct. In the subsequent loops we find and display the
minimum polynomials of $a, a^2, \ldots, a^{2t}$. In these commands, the outer loop
spans the powers $a^i$ of a, while the inner loop evaluates each factor of factf
at $a^i$. Since each power of a is a root of factf, each power of a will be
a root of an irreducible factor of factf. The factor of which each power
of a is a root will be the minimum polynomial of that power of a. The
if and print statements that appear in these commands cause the correct
minimum polynomial to be displayed. The break statement causes the
inner loop to terminate when the correct minimum polynomial is found.

> t := 3:
> for i from 1 to 2*t do
>   for j from 1 to nops(factf) do
>     fj := op(j, factf):
>     fj := unapply(fj, x):
>     if Rem(fj(a^i), p(a), a) mod 2 = 0 then
>       print(a^i, ` is a root of `, fj(x)):
>       break:
>     fi:
>   od:
> od:
a, is a root of , x^4 + x + 1

a^2, is a root of , x^4 + x + 1

a^3, is a root of , x^4 + x^3 + x^2 + x + 1

a^4, is a root of , x^4 + x + 1

a^5, is a root of , x^2 + x + 1

a^6, is a root of , x^4 + x^3 + x^2 + x + 1


Next, we define one factor of each of the three unique minimum polynomials 
in the preceding output. 


> m1 := x -> x^4 + x + 1:
> m3 := x -> x^4 + x^3 + x^2 + x + 1:
> m5 := x -> x^2 + x + 1:


We can then define the generator polynomial g(x) as the product of these 
three factors. 


> g := m1(x) * m3(x) * m5(x);

g := (x^4 + x + 1) (x^4 + x^3 + x^2 + x + 1) (x^2 + x + 1)


Finally, we convert g(x) into a function that can be evaluated in the usual 
manner as follows. 


> g := unapply(g, x); 


g := x -> (x^4 + x + 1) (x^4 + x^3 + x^2 + x + 1) (x^2 + x + 1)


4.3.2 Error Correction 


We now show how Maple can be used to correct the received polynomials 
in Example 4.4. Consider first the following received polynomial r(x). 


> r := x -> 1 + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^10:


Recall that to correct r(x) we begin by computing the first 2t syndromes 
of r(x). Before doing this, we first create a vector syn of length 2t in which 
to store the syndromes. 


> syn := vector(2*t);


syn := array(1..6,[ ]) 
By entering the following commands we generate and store the first 2t
syndromes of r(x) in syn.

> for i from 1 to 2*t do
>   syn[i] := Rem(r(a^i), p(a), a) mod 2:
>   syn[i] := ftable[ syn[i] ];
> od:
> evalm(syn);

[a^3, a^6, a^6, a^12, a^10, a^12]

We can then access particular syndromes of r(x) from the vector syn. For
example, we can access the 5th syndrome of r(x) by entering the following
command.

> syn[5];

a^10


Next, we define the 3 x 3 coefficient matrix from (4.2) as follows. 
> A := matrix( [ [syn[1], syn[2], syn[3]], [syn[2], syn[3], 
> syn[4]], [syn[3], syn[4], syn[5]] ] ); 


And next we define the vector from the right-hand side of (4.2).

> b := vector( [syn[4], syn[5], syn[6]] );

b := [a^12, a^10, a^12]

We can use the Maple det function to find the determinant of A as follows.

> d := det(A);

d := a^19 - a^27 - a^22 + 2 a^24 - a^18

The next command returns the field element that corresponds to the determinant
of A.

> d := Rem(det(A), p(a), a) mod 2;

d := a^3 + a^2 + a + 1

And the next command returns the power of a that corresponds to the
determinant of A.

> d := ftable[d];

d := a^12


Since this determinant is nonzero, we know r(x) contains three errors. As
in Example 4.4, we will use Cramer's rule to determine $\sigma_1$, $\sigma_2$, and $\sigma_3$ from
(4.2). To construct the matrices required for Cramer's rule, we will use the
Maple col function for choosing a column from a matrix. For example, the
following command returns the second column from A.

> col(A, 2);

We can easily use the Maple col and augment functions to construct the
matrices required for Cramer's rule. For example, the following command
constructs a new matrix A1 from A by replacing the first column of A with
the vector b. This matrix is necessary for Cramer's rule in computing $\sigma_3$.

> A1 := augment(b, col(A,2), col(A,3));

      [a^12   a^6    a^6 ]
A1 := [a^10   a^6    a^12]
      [a^12   a^12   a^10]

The following command returns the determinant of A1 as a power of a.

> dA1 := ftable[ Rem(det(A1), p(a), a) mod 2 ];

dA1 := a^14

Hence, by Cramer's rule, we can compute $\sigma_3$ as follows.

> sigma3 := dA1/d;

σ3 := a^2


Similarly, by Cramer’s Rule, we can determine $\sigma_2$ as follows. 
> A2 := augment(col(A,1), b, col(A,3)); 


      [ a^3  a^12  a^6  ] 
A2 := [ a^6  a^10  a^12 ] 
      [ a^6  a^12  a^10 ] 
> dA2 := ftable[ Rem( det(A2), p(a), a) mod 2 ]; 


dA2 := a^10 


> sigma2 := dA2/d; 


sigma2 := 1/a^2 


Because this expression for $\sigma_2$ is not in the exact form we want (as a 
positive power of a), we enter the following command. 
> sigma2 := sigma2 * a^(fs-1); 
sigma2 := a^13 


Finally, by Cramer’s Rule we can determine $\sigma_1$ as follows. 
> A3 := augment(col(A,1), col(A,2), b); 


> dA3 := ftable[ Rem(det(A3), p(a), a) mod 2 ]; 


dA3 := a^15 
> sigma1 := dA3/d; 


sigma1 := a^3 


Next, we define the resulting error locator polynomial. 
> EL := z^3 + sigma1*z^2 + sigma2*z + sigma3; 


EL := z^3 + a^3 z^2 + a^13 z + a^2 


> EL := unapply(EL, z); 
EL := z -> z^3 + a^3 z^2 + a^13 z + a^2 


By entering the following commands we find the roots of this error locator 
polynomial by trial and error. 
> for i from 1 to fs-1 do 


> if Rem(EL(a^i), p(a), a) mod 2 = 0 then 

> print(a^i, ` is a root of EL(z) = `, EL(z)); 
> fi: 

> od: 
a^5, is a root of EL(z) = , z^3 + a^3 z^2 + a^13 z + a^2 


a^12, is a root of EL(z) = , z^3 + a^3 z^2 + a^13 z + a^2 


a^15, is a root of EL(z) = , z^3 + a^3 z^2 + a^13 z + a^2 


Recall that $a^{15} = 1$. Hence, the roots of this error locator polynomial 
are $1$, $a^5$, and $a^{12}$. Thus, the following polynomial is the error 
$e(x) = 1 + x^5 + x^{12}$ in the received polynomial r(x). 


> e := x -> 1 + x^5 + x^12: 


The next command returns the corrected codeword. 


> c := (r(x) + e(x)) mod 2; 
c := x^2 + x^3 + x^4 + x^6 + x^7 + x^10 + x^12 
And the next command verifies that c(x) is a multiple of g(x). 
> (Factor(c) mod 2)/g(x); 


x^2 
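As a cross-check outside Maple, the computation above can be reproduced in a short Python sketch. Everything here is an assumption made for illustration: the names EXP, LOG, det3, and replaced are not from the text, field elements are stored as 4-bit integers, the field is taken to be the order-16 field built from p(x) = x^4 + x + 1, and r(x) is taken to be the first received polynomial of Example 4.4, 1 + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^10. The sketch computes the six syndromes, applies Cramer's rule, and finds the roots of the error locator polynomial by trial.

```python
# GF(16) from p(x) = x^4 + x + 1; EXP[k] is a^k as a 4-bit integer.
EXP, LOG = [0] * 30, {}
x = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = x
    LOG[x] = i
    x <<= 1
    if x & 0b10000:
        x ^= 0b10011

def mul(u, v):
    return 0 if u == 0 or v == 0 else EXP[LOG[u] + LOG[v]]

def div(u, v):
    return 0 if u == 0 else EXP[(LOG[u] - LOG[v]) % 15]

def det3(m):
    # Cofactor expansion; subtraction is XOR in characteristic 2.
    (p, q, r), (s, t, u), (v, w, z) = m
    return (mul(p, mul(t, z) ^ mul(u, w))
            ^ mul(q, mul(s, z) ^ mul(u, v))
            ^ mul(r, mul(s, w) ^ mul(t, v)))

def evaluate(exponents, k):
    # Value at a^k of the binary polynomial whose nonzero terms are x^e.
    value = 0
    for e in exponents:
        value ^= EXP[(e * k) % 15]
    return value

received = [0, 2, 3, 4, 5, 6, 7, 10]              # exponents of r(x)
syn = [evaluate(received, i) for i in range(1, 7)]
A = [syn[0:3], syn[1:4], syn[2:5]]
b = syn[3:6]
d = det3(A)

def replaced(col):
    # Copy of A with column `col` replaced by b (Cramer's rule).
    return [[b[i] if j == col else A[i][j] for j in range(3)] for i in range(3)]

sigma3, sigma2, sigma1 = (div(det3(replaced(c)), d) for c in range(3))

# Exponents k for which a^k is a root of z^3 + sigma1 z^2 + sigma2 z + sigma3.
roots = [k for k in range(15)
         if EXP[3 * k % 15] ^ mul(sigma1, EXP[2 * k % 15])
            ^ mul(sigma2, EXP[k]) ^ sigma3 == 0]
```

Here the exponent 0 in roots stands for the field element 1, which Maple's ftable reports as a^15.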


We now consider the following polynomial r(x), which is the second 
received polynomial from Example 4.4. 


> r := x -> 1 + x^3 + x^7 + x^10 + x^11 + x^13: 
To correct this received polynomial, we begin as follows by computing and 
storing the first 2t syndromes of r(x) in the vector syn. 


> for i from 1 to 2*t do 


> syn[i] := Rem(r(a^i), p(a), a) mod 2: 
> syn[i] := ftable[ syn[i] ]; 

> od: 

> evalm(syn); 


Next, we define the coefficient matrix and vector from (4.3). 
> A := matrix( [ [syn[1], syn[2], syn[3]], [syn[2], syn[3], 
> syn[4]], [syn[3], syn[4], syn[5]] ] ); 


     [ a^5   a^10  a^2  ] 
A := [ a^10  a^2   a^5  ] 
     [ a^2   a^5   a^15 ] 
> b := vector( [syn[4], syn[5], syn[6]] ); 


b := [a^5, a^15, a^4] 


We find the determinant of the coefficient matrix from (4.3) as follows. 
> d := Rem(det(A), p(a), a) mod 2; 


d:=0 


Since this determinant is zero, we know r(x) does not contain exactly three 
errors. Thus, we assume r(x) contains only two errors and define the co- 
efficient matrix and vector from (4.4). To create the coefficient matrix, we 
first use the Maple delrows command to delete the last row from A. 


> A := delrows(A, 3..3); 


Note that the last column of this new matrix A is the vector on the right- 
hand side of (4.4). We define this vector next. 


> b := col(A, 3); 
b := [a^2, a^5] 


Then, by deleting the last column from the new matrix A as follows, we 
obtain the coefficient matrix from (4.4). 


> A := delcols(A, 3..3); 


Next, we find the determinant of this 2 x 2 coefficient matrix. 
> d := ftable[ Rem(det(A), p(a), a) mod 2 ]; 


d := a^13 


Since this determinant is nonzero, we know r(x) contains two errors. We 
again use Cramer’s rule to determine $\sigma_1$ and $\sigma_2$ from (4.4). 


> A1 := augment(b, col(A,2)); 


A1 := [ a^2  a^10 ] 
      [ a^5  a^2  ] 
> dA1 := ftable[ Rem(det(A1), p(a), a) mod 2 ]; 
dA1 := a 
> sigma2 := dA1/d; 


sigma2 := 1/a^12 
> sigma2 := sigma2 * a^(fs-1); 
sigma2 := a^3 


> A2 := augment(col(A,1), b); 


A2 := [ a^5   a^2 ] 
      [ a^10  a^5 ] 


> dA2 := ftable[ Rem(det(A2), p(a), a) mod 2 ]; 


dA2 := a^3 
> sigma1 := dA2/d; 
sigma1 := 1/a^10 


> sigma1 := sigma1 * a^(fs-1); 
sigma1 := a^5 


Next, we define the resulting error locator polynomial. 
> EL := z^2 + sigma1*z + sigma2; 


EL := z^2 + a^5 z + a^3 
> EL := unapply(EL, z); 
EL := z -> z^2 + a^5 z + a^3 


We find the roots of this error locator polynomial as follows. 


> for i from 1 to fs-1 do 


> if Rem(EL(a^i), p(a), a) mod 2 = 0 then 
> print(a^i, ` is a root of EL(z) = `, EL(z)); 
> fi: 
> od: 
a, is a root of EL(z) = , z^2 + a^5 z + a^3 
a^2, is a root of EL(z) = , z^2 + a^5 z + a^3 
Thus, the following polynomial is the error $e(x) = x + x^2$ in r(x). 


> e := x -> x + x^2: 
The next command returns the corrected codeword. 
> c := (r(x) + e(x)) mod 2; 


c := 1 + x^3 + x^7 + x^10 + x^11 + x^13 + x + x^2 


We can use the Maple sort command as follows to sort the terms in the 
preceding polynomial. 


> sort(c); 


x^13 + x^11 + x^10 + x^7 + x^3 + x^2 + x + 1 


The next command verifies that c(x) is a multiple of g(x). 
> (Factor(c) mod 2)/g(x); 


(x^2 + x + 1) (x + 1) 
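The divisibility check above can also be sketched in Python by storing a binary polynomial as an integer whose bit k is the coefficient of x^k. This is an illustration only: the helper names are hypothetical, and the generator polynomial is assumed to be g(x) = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1, which is not stated in this section but is consistent with the quotients shown in the Maple output.

```python
def poly_from_exponents(exponents):
    # Bit k of the result is the coefficient of x^k.
    value = 0
    for e in exponents:
        value ^= 1 << e
    return value

def divmod_gf2(num, den):
    # Long division in Z_2[x]; subtraction is XOR.
    quotient = 0
    while num and num.bit_length() >= den.bit_length():
        shift = num.bit_length() - den.bit_length()
        quotient ^= 1 << shift
        num ^= den << shift
    return quotient, num

g = poly_from_exponents([10, 8, 5, 4, 2, 1, 0])   # assumed generator polynomial
r = poly_from_exponents([0, 3, 7, 10, 11, 13])    # second received polynomial
e = poly_from_exponents([1, 2])                   # error found above
c = r ^ e                                         # corrected codeword
quotient, remainder = divmod_gf2(c, g)
```

A zero remainder confirms that c(x) is a multiple of g(x); the quotient 0b1001 is x^3 + 1, which equals (x^2 + x + 1)(x + 1) over Z_2.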


© 1999 by CRC Press LLC 


Written Exercises 


1. Use the primitive polynomial $p(x) = x^3 + x^2 + 1 \in Z_2[x]$ to construct 
a generator polynomial for a one-error correcting BCH code. State 
the parameters [n, k] for the code. 


2. Correct the following received vectors in the BCH code that results 
from the generator polynomial in Written Exercise 1. 


(a) r = (1101110) 
(b) r = (1100010) 


3. Use the primitive polynomial $p(x) = x^3 + x^2 + 1 \in Z_2[x]$ to construct 
a generator polynomial for a 2-error correcting BCH code. State the 
parameters [n, k] for the code. How does this code compare to the 
code that results from the generator polynomial in Example 4.2? 


4. Use the primitive polynomial $p(x) = x^3 + x^2 + 1 \in Z_2[x]$ to construct 
a generator polynomial for a 3-error correcting BCH code. State the 
parameters [n, k] for the code. How does this code compare to the 
code that results from the generator polynomial in Written Exercise 
3? Can you make any additional conclusions about the code that 
results from the generator polynomial in Written Exercise 3? 


5. Use the primitive polynomial $p(x) = x^4 + x + 1 \in Z_2[x]$ to construct 
a generator polynomial for a [15,7] BCH code. State the number of 
codewords in the code and the number of errors that can be corrected 
in the code. How does this code compare to the code that results from 
the generator polynomial in Example 4.3? 


6. Use the primitive polynomial $p(x) = x^4 + x^3 + 1 \in Z_2[x]$ to construct 
a generator polynomial for a 2-error correcting BCH code. State the 
code’s parameters [n, k] and the number of codewords in the code. 


7. Use the primitive polynomial $p(x) = x^4 + x^3 + 1 \in Z_2[x]$ to construct 
a generator polynomial for a 3-error correcting BCH code. State the 
code’s parameters [n, k] and the number of codewords in the code. 


8. Correct the following received vectors in the BCH code that results 
from the generator polynomial in Example 4.3. 


(a) r = (100011011001010) 
(b) r = (011111010011010) 
(c) r = (101000011101100) 
(d) r = (111011001010100) 
9. The irreducible factors of $f(x) = x^{31} - 1 \in Z_2[x]$ are (x + 1) and six 
primitive polynomials of degree 5. To construct a t-error correcting 
BCH code with binary codewords of length 31 positions, we begin 
with a primitive polynomial $p(x) \in Z_2[x]$ of degree 5, and a certain 
number of powers of a = x in the field of order 32 resulting from p(x). 
This determines a generator polynomial for the code, the number of 
basis elements in the code, and the number of codewords in the code. 
Complete the following table for a BCH code with binary codewords 
of length 31 positions. 


| Number of Correctable Errors | Number of Powers of a Needed | Degree of Generator Polynomial | Number of Basis Elements | Number of Codewords | 
| 3 |  |  |  |  | 
| 4 |  |  |  |  | 
| 5 |  |  |  |  | 
| 6 |  |  |  |  | 


Maple Exercises 


1. Find a primitive polynomial of degree 5 in $Z_2[x]$, and use this polyno- 
mial to construct a generator polynomial for a BCH code. State the 
parameters [n, k] for the code, the number of codewords in the code, 
and the number of errors that can be corrected in the code. 


Use the primitive polynomial p(x) = xê + xë +1 € Ze[z] to construct 
a generator polynomial for a 3-error correcting BCH code. State the 
code’s parameters [n, k] and the number of codewords in the code. 


3. Correct the following received polynomials in the BCH code that re- 
sults from the generator polynomial in Maple Exercise 2. 


(a) r(x) = l1+r+ r2 4 x? ES xt + ge i3 4 gid + 6 4 git 
gp TB te gO PD) neh as pO we OE 

(b) r(x) = ltaeta?te? +r +r? Hr 4 oP + a9 +o 
4 gl? p glt vies p gl p g0 p gl g2 p 2 


LÌ + x4 tg? + x! +p xlt + t8 p rl? 4 720 4 gr? 



Chapter 5 


Reed-Solomon Codes 


In this chapter we discuss a class of codes called Reed-Solomon codes. These 
codes, like BCH codes, have polynomial codewords, are linear, and can be 
constructed to be multiple-error correcting. However, Reed-Solomon codes 
are significantly more popular than BCH codes and all other types of codes 
because they are uniquely ideal for correcting error bursts. A received vec- 
tor is said to contain an error burst if it contains several errors very close 
together. There are many situations in which transmission errors in binary 
vectors occur naturally in bursts. We describe one such situation, an ap- 
plication of a Reed-Solomon code in the Voyager 2 satellite that returned 
photographs to Earth of several of the planets in our solar system, in Sec- 
tion 5.6. Reed-Solomon codes also have numerous other applications. For 
example, another extensive and well-known use of Reed-Solomon codes is 
in the encoding of music, software, and other information on compact discs. 


5.1 Construction of Reed-Solomon Codes 


To construct a Reed-Solomon code, we begin as in the construction of a 
BCH code by choosing a primitive polynomial p(x) of degree n in $Z_2[x]$ and 
forming the field $F = Z_2[x]/(p(x))$ of order $2^n$. We will again denote the 
element x in this field by a. Reed-Solomon codewords, like BCH codewords, 
are then polynomials of degree less than $2^n - 1$. However, unlike BCH 
codewords, which are elements in $Z_2[x]$, Reed-Solomon codewords are in 
$F[x]$. To construct a t-error correcting Reed-Solomon code C, we use the 
following generator polynomial $g(x) \in F[x]$. 


$$g(x) = (x - a)(x - a^2) \cdots (x - a^{2t})$$ 

The codewords in C are then all multiples b(x)g(x) of degree less than 
$2^n - 1$ with $b(x) \in F[x]$. Theorem 4.2 can easily be modified to show that 
C is t-error correcting. The codewords in C have length $2^n - 1$ positions 
and form a vector space with dimension $2^n - 1 - 2t$. We will use the notation 
$RS(2^n - 1, t)$ to represent a t-error correcting Reed-Solomon code with 
codewords of length $2^n - 1$ positions. 


Example 5.1 Choose primitive polynomial $p(x) = x^4 + x + 1 \in Z_2[x]$. The 
nonzero elements in $F = Z_2[x]/(p(x))$ are listed in the table in Example 
4.3. Using this field F, we obtain the following generator polynomial g(x) 
for an RS(15,2) code C. 


$$g(x) = (x - a)(x - a^2)(x - a^3)(x - a^4) = x^4 + a^{13}x^3 + a^6x^2 + a^3x + a^{10}$$ 


To construct one of the codewords in C, consider b(x) = ax? € F[z]. 
Then 
b(x) g(x) = a et + axt? + axl! + ar! + gx? 


is one of the codewords in C. | 
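The expansion of a Reed-Solomon generator polynomial can be checked with a short Python sketch. The code below is an illustration under stated assumptions: field elements of the order-16 field from p(x) = x^4 + x + 1 are stored as 4-bit integers, polynomials in F[x] are coefficient lists with the constant term first, and the names rs_generator and g_powers are hypothetical.

```python
M, PRIM = 4, 0b10011            # p(x) = x^4 + x + 1
N = (1 << M) - 1                # number of nonzero field elements
EXP, LOG = [0] * (2 * N), {}
x = 1
for i in range(N):
    EXP[i] = EXP[i + N] = x     # EXP[k] is a^k
    LOG[x] = i
    x <<= 1
    if x >> M:
        x ^= PRIM

def mul(u, v):
    return 0 if u == 0 or v == 0 else EXP[LOG[u] + LOG[v]]

def poly_mul(f, g):
    # Coefficient lists, lowest degree first; addition in F is XOR.
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] ^= mul(fi, gj)
    return out

def rs_generator(t):
    g = [1]
    for i in range(1, 2 * t + 1):
        g = poly_mul(g, [EXP[i], 1])   # x - a^i equals x + a^i in characteristic 2
    return g

g = rs_generator(2)
g_powers = [LOG[c] for c in g]         # exponent of a for each coefficient
```

Reading g_powers from the constant term upward gives the exponents 10, 3, 6, 13, 0, that is, the coefficients of the generator polynomial for the RS(15,2) code in Example 5.1.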


The fact that Reed-Solomon codewords are in $F[x]$ causes two problems 
we must address. First, unlike BCH codewords, Reed-Solomon codewords 
cannot be transmitted as binary vectors by simply listing the coefficients. 
Despite this, Reed-Solomon codewords are still transmitted as binary vec- 
tors. We will discuss transmission of Reed-Solomon codewords in Section 
5.4. The other problem we must address is that of error correction, for 
the BCH error correction scheme cannot be used to correct errors in Reed- 
Solomon codewords. Actually, applying the BCH error correction scheme to 
a received Reed-Solomon polynomial yields the same information as when 
it is applied to a received BCH polynomial. Recall that the last step in the 
BCH error correction scheme involves finding the roots of the error locator 
polynomial. This reveals only the error positions in the received polyno- 
mial. However, because there are only two possible coefficients for each 
term in a BCH polynomial, knowledge of the error positions is all that is 
necessary to correct the polynomial. The BCH error correction scheme can 
also be used to find the error positions in a received Reed-Solomon polyno- 
mial, but because there is more than one possible coefficient for each term 
in a Reed-Solomon polynomial, knowledge of the error positions is gener- 
ally not enough to correct the polynomial. The specific error in each error 
position must also be determined. Hence, we must present a new error cor- 
rection scheme for correcting Reed-Solomon polynomials. We discuss this 
error correction scheme next. 



5.2 Error Correction in Reed-Solomon Codes 


Before stating the Reed-Solomon error correction scheme, we first note the 
following analogue to Theorem 4.1. 


Theorem 5.1 Let F be a field of order $2^n$, and let C be an $RS(2^n - 1, t)$ 
code in $F[x]$. Suppose $c(x) \in F[x]$ has degree less than $2^n - 1$. Then 
$c(x) \in C$ if and only if $c(a^i) = 0$ for $i = 1, \ldots, 2t$. 
Proof. Exercise. ∎ 


Theorem 5.1 is useful for error correction in Reed-Solomon codes in 
the same way Theorem 4.1 is useful for error correction in BCH codes. 
Specifically, let F be a field of order $2^n$, and let C be an $RS(2^n - 1, t)$ code 
in $F[x]$. Suppose $c(x) \in C$ is transmitted and we receive the polynomial 
$r(x) \ne c(x)$ in $F[x]$ of degree less than $2^n - 1$. Then r(x) = c(x) + e(x) 
for some nonzero error polynomial e(x) in $F[x]$ of degree less than $2^n - 1$. 
To correct r(x) we need only determine e(x), for we could then compute 
c(x) = r(x) + e(x). But note that Theorem 5.1 implies $r(a^i) = e(a^i)$ for 
$i = 1, \ldots, 2t$. Thus, by knowing r(x), we also know some information about 
e(x). We will again call the values of $r(a^i)$ the syndromes of r(x). 
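The syndrome property can be illustrated with a small Python sketch. It assumes the RS(15,2) code of Example 5.1 over the field built from p(x) = x^4 + x + 1, uses the generator polynomial itself as the codeword, and introduces a single hypothetical error a^7 x^9; the helper names are illustrative only.

```python
M, PRIM = 4, 0b10011            # p(x) = x^4 + x + 1
N = (1 << M) - 1
EXP, LOG = [0] * (2 * N), {}
x = 1
for i in range(N):
    EXP[i] = EXP[i + N] = x     # EXP[k] is a^k as a 4-bit integer
    LOG[x] = i
    x <<= 1
    if x >> M:
        x ^= PRIM

def mul(u, v):
    return 0 if u == 0 or v == 0 else EXP[LOG[u] + LOG[v]]

def poly_eval(f, point):
    # Horner's rule; f lists coefficients from the constant term upward.
    acc = 0
    for coeff in reversed(f):
        acc = mul(acc, point) ^ coeff
    return acc

# Codeword: g(x) = x^4 + a^13 x^3 + a^6 x^2 + a^3 x + a^10, padded to length 15.
c = [EXP[10], EXP[3], EXP[6], EXP[13], 1] + [0] * 10

# Hypothetical transmission error e(x) = a^7 x^9.
r = c[:]
r[9] ^= EXP[7]

codeword_values = [poly_eval(c, EXP[i]) for i in range(1, 5)]
syndromes = [poly_eval(r, EXP[i]) for i in range(1, 5)]
error_values = [EXP[(7 + 9 * i) % N] for i in range(1, 5)]   # e(a^i)
```

The codeword evaluates to zero at a, a^2, a^3, a^4, so the syndromes of r(x) coincide with the values e(a^i) of the error polynomial.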


We now outline the Reed-Solomon error correction scheme. As we will 
show, this error correction scheme is only slightly more computationally 
intensive than the BCH error correction scheme. However, because veri- 
fication of the Reed-Solomon error correction scheme is significantly more 
involved than verification of the BCH error correction scheme, we will not 
verify the Reed-Solomon error-correction scheme in this section. In this 
section we will summarize and illustrate the Reed-Solomon error correction 
scheme. We will then verify the Reed-Solomon error correction scheme in 
Section 5.3. 


Let F be a field of order $2^n$, and let C be an $RS(2^n - 1, t)$ code in 
$F[x]$. Suppose $c(x) \in C$ is transmitted and we receive the polynomial 
r(x) = c(x) + e(x) for some nonzero error polynomial e(x) in $F[x]$ of degree 
less than $2^n - 1$. We can use the following steps to determine e(x). 


1. We first compute the first 2t syndromes of r(x), which we will denote 
by $S_1 = r(a), S_2 = r(a^2), \ldots, S_{2t} = r(a^{2t})$, and form the following 
syndrome polynomial S(z). 


$$S(z) = S_1 + S_2 z + S_3 z^2 + \cdots + S_{2t} z^{2t-1}$$ 
(Note: Because r(x) is not necessarily in $Z_2[x]$, it will not necessarily 
be the case that $r(a^{2k}) = r(a^k)^2$ for any integer k.) 

2. Next, we construct the Euclidean algorithm table (see Section 1.6) for 
the polynomials $a(z) = z^{2t}$ and $b(z) = S(z)$ in $F[z]$, stopping at the 
first row j for which $\deg(r_j) < t$. (The U column may be excluded 
from this table.) Let $R(z) = r_j$ and $V(z) = v_j$. 


3. We can then find the error positions in r(x) by finding the roots of 
V(z). Specifically, if $a^{i_1}, a^{i_2}, \ldots, a^{i_k}$ are the roots of V(z), then r(x) 
contains errors in positions $x^{-i_1}, x^{-i_2}, \ldots, x^{-i_k}$, where the exponents 
are read modulo $2^n - 1$. Finally, we must find the coefficients of e(x) at 
these error positions. Let $e_{-i}$ be the coefficient of the $x^{-i}$ term in e(x). 
Then $e_{-i} = \dfrac{R(a^i)}{V'(a^i)}$. 


We will illustrate this error correction scheme first in a BCH code by 
correcting the first received vector in Example 4.4. Although the BCH error 
correction scheme is not sufficient to correct errors in Reed-Solomon code- 
words, the Reed-Solomon error correction scheme can be used to correct 
errors in BCH codewords. 


Example 5.2 Let C be the BCH code that results from the generator 
polynomial in Example 4.3. Suppose a codeword $c(x) \in C$ is transmitted 
and we receive the polynomial $r(x) = 1 + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^{10}$. In 
this example we will correct r(x) using the Reed-Solomon error correction 
scheme. Since C is 3-error correcting, we begin by computing the first six 
syndromes of r(x). These syndromes were computed in Example 4.4 and 
are as follows. 


$$S_1 = r(a) = a^3, \quad S_2 = r(a^2) = a^6, \quad S_3 = r(a^3) = a^6,$$ 
$$S_4 = r(a^4) = a^{12}, \quad S_5 = r(a^5) = a^{10}, \quad S_6 = r(a^6) = a^{12}$$ 


These syndromes yield the following syndrome polynomial S(z). 


$$S(z) = a^3 + a^6 z + a^6 z^2 + a^{12} z^3 + a^{10} z^4 + a^{12} z^5$$ 


Constructing the Euclidean algorithm table for $a(z) = z^6$ and $b(z) = S(z)$ 
(with numerous necessary calculations omitted), we obtain the following. 


Row   Q             R                                               V 
-1    -             $z^6$                                           $0$ 
0     -             $S(z)$                                          $1$ 
1     $a^3z + a$    $a^{12}z^4 + a^{10}z^3 + z^2 + a^{10}z + a^4$   $a^3z + a$ 
2     $z$           $a^{11}z^3 + a^7z^2 + a^{12}z + a^3$            $a^3z^2 + az + 1$ 
3     $az + a^5$    $a^4z^2 + a^5$                                  $a^4z^3 + z^2 + a^5z + a^2$ 

Note that we have not included the U column in this table, and that we 
have stopped at the first row for which the degree of the entry in the R 
column is less than the number of errors that can be corrected in C. Next, 
we let $R(z) = a^4z^2 + a^5$ and $V(z) = a^4z^3 + z^2 + a^5z + a^2$. By evaluating 
V(z) at successive powers of a, we can find that the roots of V(z) are $a^0 = 1$, 
$a^3$, and $a^{10}$. The positions in r(x) that contain errors are $x^0$, $x^{-3} = x^{12}$, 
and $x^{-10} = x^5$. To determine the coefficients of these terms in the error 
polynomial, we first note that $V'(z) = a^4z^2 + a^5$. We can then determine 
the coefficients of the terms in the error polynomial as follows. 


$$e_0 = \frac{R(a^0)}{V'(a^0)} = \frac{a^4 + a^5}{a^4 + a^5} = 1$$ 
$$e_5 = \frac{R(a^{10})}{V'(a^{10})} = \frac{a^{24} + a^5}{a^{24} + a^5} = 1$$ 
$$e_{12} = \frac{R(a^3)}{V'(a^3)} = \frac{a^{10} + a^5}{a^{10} + a^5} = 1$$ 


Hence, the error in r(x) can be expressed as the error polynomial 
$e(x) = 1 \cdot x^0 + 1 \cdot x^5 + 1 \cdot x^{12} = 1 + x^5 + x^{12}$. ∎ 


Although the error correction procedure in Example 5.2 appears less 
involved than the procedure used to correct the same received polynomial 
in Example 4.4, the procedure in Example 5.2 is actually more involved due 
to the many calculations necessary in constructing the Euclidean algorithm 
table. Also, Example 5.2 shows a relatively simple example of the Reed- 
Solomon error correction scheme because it is applied to a polynomial in 
$Z_2[x]$. If the scheme is applied to a more general polynomial, the process 
can be even more involved. We illustrate this next. 


Example 5.3 Consider the primitive polynomial $p(x) = x^4 + x^3 + 1$ in 
$Z_2[x]$. For the element a = x in the field $F = Z_2[x]/(p(x))$ of order 16, 
we list the field elements that correspond to the first 15 powers of a in the 
following table. 


Power     Field Element            Power      Field Element 
$a^1$     $a$                      $a^9$      $a^2 + 1$ 
$a^2$     $a^2$                    $a^{10}$   $a^3 + a$ 
$a^3$     $a^3$                    $a^{11}$   $a^3 + a^2 + 1$ 
$a^4$     $a^3 + 1$                $a^{12}$   $a + 1$ 
$a^5$     $a^3 + a + 1$            $a^{13}$   $a^2 + a$ 
$a^6$     $a^3 + a^2 + a + 1$      $a^{14}$   $a^3 + a^2$ 
$a^7$     $a^2 + a + 1$            $a^{15}$   $1$ 
$a^8$     $a^3 + a^2 + a$ 
Let C be the RS(15,3) code that results from p(x). The following is the 
generator polynomial g(x) for C. 


$$g(x) = (x - a)(x - a^2)(x - a^3)(x - a^4)(x - a^5)(x - a^6)$$ 
$$\phantom{g(x)} = x^6 + a^{12}x^5 + x^4 + a^2x^3 + a^7x^2 + a^{11}x + a^6$$ 

To construct one of the codewords in C, consider the polynomial 
$$b(x) = a^8x^8 + a^3x^7 + a^7x^6 + a^9x^5 + x^4 + a^3x^3 + a^6x^2 + a^3x + a^7$$ 
in $F[x]$. Then 


$$\begin{aligned} c(x) = b(x)g(x) &= a^8x^{14} + a^{12}x^{13} + a^3x^{12} + a^8x^{11} + a^{12}x^{10} + a^3x^9 + a^8x^8 \\ &\quad + a^{12}x^7 + a^3x^6 + a^4x^5 + a^4x^4 + a^8x^3 + a^{12}x^2 + a^{11}x + a^{13} \end{aligned}$$ 


is one of the codewords in C. Suppose c(x) is transmitted and we receive 
the following polynomial r(x). 


$$\begin{aligned} r(x) &= a^8x^{14} + a^{12}x^{13} + a^3x^{12} + a^8x^{11} + a^{12}x^{10} + a^3x^9 + a^9x^8 \\ &\quad + a^5x^7 + a^{13}x^6 + a^4x^5 + a^4x^4 + a^8x^3 + a^{12}x^2 + a^{11}x + a^{13} \end{aligned}$$ 


Note that r(x) contains errors in the $x^8$, $x^7$, and $x^6$ positions. To correct 
r(x), since C is 3-error correcting, we begin by computing the first six 
syndromes of r(x). We list these syndromes below. 


$$S_1 = r(a) = 0, \quad S_2 = r(a^2) = 0, \quad S_3 = r(a^3) = a^2,$$ 
$$S_4 = r(a^4) = 1, \quad S_5 = r(a^5) = 1, \quad S_6 = r(a^6) = a^{12}$$ 


These syndromes yield the following syndrome polynomial S(z). 
$$S(z) = a^2z^2 + z^3 + z^4 + a^{12}z^5$$ 


Constructing the Euclidean algorithm table for $a(z) = z^6$ and $b(z) = S(z)$ 
(again with numerous calculations omitted), we obtain the following. 


Row   Q               R                           V 
-1    -               $z^6$                       $0$ 
0     -               $S(z)$                      $1$ 
1     $a^3z + a^6$    $a^7z^4 + a^2z^3 + a^8z^2$  $a^3z + a^6$ 
2     $a^5z + a^6$    $a^4z^3 + a^3z^2$           $a^8z^2 + a^3z + a$ 
3     $a^3z + a$      $a^7z^2$                    $a^{11}z^3 + a^{10}z^2 + a^3z + a^5$ 

Hence, $R(z) = a^7z^2$ and $V(z) = a^{11}z^3 + a^{10}z^2 + a^3z + a^5$. By evaluating 
V(z) at successive powers of a, we can find that the roots of V(z) are $a^7$, 
$a^8$, and $a^9$. Thus, the positions in r(x) that contain errors are $x^{-7} = x^8$, 
$x^{-8} = x^7$, and $x^{-9} = x^6$. To determine the coefficients of these terms in 
the error polynomial, we first note that $V'(z) = a^{11}z^2 + a^3$. We can then 
determine the coefficients of the terms in the error polynomial as follows. 


$$e_6 = \frac{R(a^9)}{V'(a^9)} = \frac{a^{25}}{a^{29} + a^3} = \frac{a^{10}}{a^2} = a^8$$ 
$$e_7 = \frac{R(a^8)}{V'(a^8)} = \frac{a^{23}}{a^{27} + a^3} = \frac{a^8}{a^5} = a^3$$ 
$$e_8 = \frac{R(a^7)}{V'(a^7)} = \frac{a^{21}}{a^{25} + a^3} = \frac{a^6}{a} = a^5$$ 


Therefore, the error in r(x) can be expressed as the error polynomial 
$e(x) = a^5x^8 + a^3x^7 + a^8x^6$. It can easily be verified that forming r(x) + e(x) 
yields the codeword c(x). ∎ 
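The three steps of the scheme, applied to the received polynomial of Example 5.3, can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the text: the function and variable names are hypothetical, field elements are 4-bit integers, polynomials are coefficient lists with the constant term first, and the sketch assumes at most t errors occurred (it does not detect decoding failures).

```python
M, PRIM = 4, 0b11001            # p(x) = x^4 + x^3 + 1
N = (1 << M) - 1
EXP, LOG = [0] * (2 * N), {}
x = 1
for i in range(N):
    EXP[i] = EXP[i + N] = x     # EXP[k] is a^k
    LOG[x] = i
    x <<= 1
    if x >> M:
        x ^= PRIM

def mul(u, v):
    return 0 if u == 0 or v == 0 else EXP[LOG[u] + LOG[v]]

def inv(u):
    return EXP[N - LOG[u]]

def trim(f):
    f = list(f)
    while f and f[-1] == 0:     # drop zero leading coefficients
        f.pop()
    return f

def poly_add(f, g):
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return trim(u ^ v for u, v in zip(f, g))

def poly_mul(f, g):
    out = [0] * max(len(f) + len(g) - 1, 0)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] ^= mul(fi, gj)
    return trim(out)

def poly_divmod(f, g):
    rem, q = list(f), [0] * max(len(f) - len(g) + 1, 0)
    for shift in range(len(f) - len(g), -1, -1):
        coef = mul(rem[shift + len(g) - 1], inv(g[-1]))
        q[shift] = coef
        for k, gk in enumerate(g):
            rem[shift + k] ^= mul(coef, gk)
    return trim(q), trim(rem)

def poly_eval(f, point):
    acc = 0
    for coeff in reversed(f):
        acc = mul(acc, point) ^ coeff
    return acc

def rs_error_polynomial(r, t):
    # Step 1: syndrome polynomial S(z) = S_1 + S_2 z + ... + S_2t z^(2t-1).
    S = trim(poly_eval(r, EXP[i]) for i in range(1, 2 * t + 1))
    # Step 2: Euclidean algorithm on z^(2t) and S(z), tracking only R and V.
    r_prev, r_cur = [0] * (2 * t) + [1], S
    v_prev, v_cur = [], [1]
    while len(r_cur) - 1 >= t:
        q, rem = poly_divmod(r_prev, r_cur)
        r_prev, r_cur = r_cur, rem
        v_prev, v_cur = v_cur, poly_add(v_prev, poly_mul(q, v_cur))
    R, V = r_cur, v_cur
    # Step 3: roots a^i of V give positions x^(-i); values are R(a^i)/V'(a^i).
    dV = [V[k] if k % 2 else 0 for k in range(1, len(V))]   # formal derivative
    e = [0] * N
    for i in range(N):
        if poly_eval(V, EXP[i]) == 0:
            e[(-i) % N] = mul(poly_eval(R, EXP[i]), inv(poly_eval(dV, EXP[i])))
    return e

# Exponents of a in the coefficients of r(x), from the constant term to x^14.
r = [EXP[k] for k in [13, 11, 12, 8, 4, 4, 13, 5, 9, 3, 12, 8, 3, 12, 8]]
e = rs_error_polynomial(r, 3)
errors = {pos: LOG[val] for pos, val in enumerate(e) if val}
```

The dictionary errors maps each error position to the exponent of a in the error value, so the entry 8: 5 stands for the term a^5 x^8.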


5.3 Proof of Reed-Solomon Error Correction 


In this section we verify the Reed-Solomon error correction scheme sum- 
marized and illustrated in Section 5.2. As we mentioned in Section 5.2, 
this verification is extensive, so the reader may wish to postpone this sec- 
tion until completing the remainder of this chapter, or skip this section 
altogether. 


Let F be a field of order $2^n$, and let C be an $RS(2^n - 1, t)$ code in 
$F[x]$. Suppose $c(x) \in C$ is transmitted and we receive the polynomial 
r(x) = c(x) + e(x) for some nonzero error polynomial e(x) in $F[x]$ of degree 
less than $2^n - 1$. We will denote this error polynomial by 

$$e(x) = \sum_{j=0}^{m-1} e_j x^j$$ 

with $m = 2^n - 1$ and coefficients $e_j \in F$. To determine e(x) from r(x), 
we begin by computing the first 2t syndromes of r(x). We denote these 
syndromes as follows. 


$$S_i = r(a^i) = e(a^i) = \sum_{j=0}^{m-1} e_j a^{ij} \quad \text{for } i = 1, \ldots, 2t$$ 


Next, we use these syndromes to construct the syndrome polynomial 
$S(z) = \sum_{i=0}^{2t-1} S_{i+1} z^i$. Note then that 

$$S(z) = \sum_{i=0}^{2t-1} \sum_{j=0}^{m-1} e_j a^{(i+1)j} z^i = \sum_{j=0}^{m-1} e_j a^j \sum_{i=0}^{2t-1} a^{ij} z^i.$$ 


Let M be the set of integers that correspond to the error positions in r(x). 
That is, let $M = \{ j \le m - 1 \mid e_j \ne 0 \}$. Note also that 


$$\begin{aligned} S(z) &= \sum_{j \in M} e_j a^j \sum_{i=0}^{2t-1} a^{ij} z^i = \sum_{j \in M} e_j a^j \left( \frac{1 - a^{2tj} z^{2t}}{1 - a^j z} \right) \\ &= \sum_{j \in M} \frac{e_j a^j}{1 - a^j z} - \sum_{j \in M} \frac{e_j a^{j(2t+1)} z^{2t}}{1 - a^j z}. \end{aligned}$$ 

Hence, for the polynomials 

$$R(z) = \sum_{j \in M} e_j a^j \prod_{i \in M,\, i \ne j} (1 - a^i z),$$ 

$$U(z) = \sum_{j \in M} e_j a^{j(2t+1)} \prod_{i \in M,\, i \ne j} (1 - a^i z), \text{ and}$$ 

$$V(z) = \prod_{i \in M} (1 - a^i z),$$ 


it follows that 

$$S(z) = \frac{R(z)}{V(z)} - \frac{U(z) z^{2t}}{V(z)},$$ 

or, equivalently, 

$$U(z) z^{2t} + V(z) S(z) = R(z).$$ 
This last equation is called the fundamental equation. In this equation, V(z) 
is called the error locator polynomial, R(z) is called the error evaluator 
polynomial, and U(z) is called the error coevaluator polynomial. Note that 
this error locator polynomial V(z) is not the same as the error locator 
polynomial from the BCH error correction scheme. Note also that 


$$(U(z), V(z)) = (R(z), V(z)) = 1.$$ 


We now consider how to determine the error locator, error evaluator, 
and error coevaluator polynomials. As we will show, these polynomials are 
the entries in the Euclidean algorithm table for $a(z) = z^{2t}$ and $b(z) = S(z)$ 
in the first row j for which $\deg(r_j) < t$. The following results verify this. For 
convenience, in these results we suppress the variable z whenever possible. 
Theorem 5.2 Suppose $VS + Uz^{2t} = R$ for some syndrome polynomial S, 
and let $V_0$, $U_0$, and $R_0$ be polynomials that satisfy 


$$V_0 S + U_0 z^{2t} = R_0; \quad \deg(V_0) \le t, \quad \deg(U_0) < t, \quad \deg(R_0) < t.$$ 


Then there exists a polynomial $h \in F[z]$ such that $V_0 = hV$, $U_0 = hU$, and 
$R_0 = hR$. If it is also true that $(V_0, U_0) = 1$, then h is constant. 


Proof. Note first that since $VS + Uz^{2t} = R$ and $V_0 S + U_0 z^{2t} = R_0$, 
it follows that 
$$V_0 V S + V_0 U z^{2t} = V_0 R$$ 
and 
$$V V_0 S + V U_0 z^{2t} = V R_0.$$ 


Hence, by subtraction, 
$$(V_0 U - V U_0) z^{2t} = V_0 R - V R_0.$$ 


By a degree argument we can see that both sides of the preceding equation 
must be equal to 0. Thus, 


$$V_0 U - V U_0 = V_0 R - V R_0 = 0.$$ 


Since $(V, U) = 1$, there must exist polynomials $\alpha, \beta \in F[z]$ for which 
$\alpha V + \beta U = 1$. Therefore, 


$$V_0 \alpha V + V_0 \beta U = V_0.$$ 
But since $V_0 U = V U_0$, then 

$$V_0 \alpha V + V \beta U_0 = V_0,$$ 
or, equivalently, 

$$(V_0 \alpha + U_0 \beta) V = V_0.$$ 


Now, let 
$$h = V_0 \alpha + U_0 \beta.$$ 


Then $hV = V_0$. Also, $hVU = V_0 U = V U_0$ implies $hU = U_0$, and 
$hVR = V_0 R = V R_0$ implies $hR = R_0$. Finally, since h must divide 
$V_0$ and $U_0$, if $(V_0, U_0) = 1$, then h must be constant. ∎ 


Theorem 5.3 Suppose $a = z^{2t}$ and $b = S$ for some syndrome polynomial 
S. In the Euclidean algorithm table for a and b, let j be the first row for 
which $\deg(r_j) < t$. Define $R_0 = r_j$, $U_0 = u_j$, and $V_0 = v_j$. Then $R_0$, $U_0$, 
and $V_0$ satisfy all of the conditions in Theorem 5.2. 

Proof. From Equation (1.5) we know that $r_j = u_j z^{2t} + v_j S$. Hence, 
$R_0 = U_0 z^{2t} + V_0 S$. Furthermore, since $R_0 = r_j$ and $\deg(r_j) < t$, we 
know that $\deg(R_0) < t$. Now, because $\deg(v_{j-1}) < \deg(v_j) = \deg(V_0)$ and 
$\deg(r_{j-1}) > \deg(r_j) = \deg(R_0)$, it follows that $\deg(v_{j-1} R_0) < \deg(V_0 r_{j-1})$. 
But by Equation (1.7), we know that $R_0 v_{j-1} - r_{j-1} V_0 = a = z^{2t}$. Thus, 
$\deg(r_{j-1} V_0) \le 2t$, and since $\deg(r_{j-1}) \ge t$, it follows that $\deg(V_0) \le t$. 
Also, since $\deg(u_{j-1}) < \deg(u_j) = \deg(U_0)$, it must be the case that 
$\deg(R_0 u_{j-1}) < \deg(r_{j-1} U_0)$. But by Equation (1.6) we know that 
$R_0 u_{j-1} - r_{j-1} U_0 = b = S$. Therefore, $\deg(U_0 r_{j-1}) < 2t$, and since 
$\deg(r_{j-1}) \ge t$, it follows that $\deg(U_0) < t$. It remains to be shown only that 
$(V_0, U_0) = 1$. But by Equation (1.8) we know that $u_{j-1} v_j - u_j v_{j-1} = 1$. 
Hence, $u_{j-1} V_0 - U_0 v_{j-1} = 1$, and $(V_0, U_0) = 1$. ∎ 


In summary, for a syndrome polynomial S(z), to determine the error 
locator, error evaluator, and error coevaluator polynomials V(z), R(z), 
and U(z), we construct the Euclidean algorithm table for $a(z) = z^{2t}$ and 
$b(z) = S(z)$. At the first row j for which $\deg(r_j) < t$, we have $r_j = R_0 = hR(z)$, 
$u_j = U_0 = hU(z)$, and $v_j = V_0 = hV(z)$. Since $(V_0, U_0) = 1$, h is a nonzero 
constant. Hence, $r_j$, $u_j$, and $v_j$ agree with R(z), U(z), and V(z) up to the 
common constant factor h, which changes neither the roots of V(z) nor the 
quotient R(z)/V'(z), so we may use $r_j$, $u_j$, and $v_j$ in place of R(z), U(z), 
and V(z). By finding the roots of V(z), we can determine the locations of 
the errors in the received polynomial as described previously. To find the 
coefficients of the error polynomial terms, note that since 
$V(z) = \prod_{i \in M} (1 - a^i z)$, then 


$$V'(z) = \sum_{j \in M} (-a^j) \prod_{i \in M,\, i \ne j} (1 - a^i z).$$ 


And recall we already know that 


$$R(z) = \sum_{j \in M} e_j a^j \prod_{i \in M,\, i \ne j} (1 - a^i z).$$ 


By evaluating the preceding polynomials at $a^{-j}$, we obtain the following. 


$$V'(a^{-j}) = -a^j \prod_{i \in M,\, i \ne j} (1 - a^{i-j})$$ 


$$R(a^{-j}) = e_j a^j \prod_{i \in M,\, i \ne j} (1 - a^{i-j})$$ 


Hence, since $-e_j = e_j$ in a field of characteristic 2, $\dfrac{R(a^{-j})}{V'(a^{-j})} = e_j$ 
reveals the coefficient of $x^j$ in the error polynomial. 
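As a worked check of the role of the constant h, consider the final V entry in the table of Example 5.2, where the field is built from p(x) = x^4 + x + 1 and the error positions are M = {0, 5, 12}. The factorization below was computed by hand for illustration and is not part of the original argument; it shows that the table entry equals a^2 V(z), and that this constant cancels in the quotient used to recover the error values.

```latex
\begin{align*}
V_0(z) &= a^4 z^3 + z^2 + a^5 z + a^2
        = a^2 (1 - z)(1 - a^5 z)(1 - a^{12} z) = a^2\, V(z), \\
\frac{R_0(a^{-j})}{V_0'(a^{-j})}
       &= \frac{h\, R(a^{-j})}{h\, V'(a^{-j})}
        = \frac{e_j a^j \prod_{i \in M,\, i \ne j} (1 - a^{i-j})}
               {-a^j \prod_{i \in M,\, i \ne j} (1 - a^{i-j})}
        = -e_j = e_j .
\end{align*}
```

So in that example h = a^2, and the error values computed from the table entries are unaffected.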
5.4 Binary Reed-Solomon Codes 


We still have two topics to address regarding Reed-Solomon codes. We 
stated in the introduction to this chapter that Reed-Solomon codes are 
uniquely ideal for correcting error bursts, but we have not yet discussed 
how or why. Also, we stated in Section 5.1 that Reed-Solomon codewords 
are transmitted as binary vectors, but we have not yet mentioned the actual 
form in which Reed-Solomon codewords are transmitted. It turns out that 
these two topics are intimately connected. We will first discuss the form in 
which Reed-Solomon codewords are transmitted. 


Consider the codeword c(x) in the Reed-Solomon code C in Example 
5.3. To convert c(x) to a binary vector, we would begin by listing the terms 
in c(x) as follows with increasing powers of x. 


$$\begin{aligned} c(x) &= a^{13} + a^{11}x + a^{12}x^2 + a^8x^3 + a^4x^4 + a^4x^5 + a^3x^6 + a^{12}x^7 \\ &\quad + a^8x^8 + a^3x^9 + a^{12}x^{10} + a^8x^{11} + a^3x^{12} + a^{12}x^{13} + a^8x^{14} \end{aligned}$$ 
Next, we would write the coefficients in c(x) as follows, using the table in 
Example 5.3 to express each coefficient as a polynomial in a of degree less 
than the degree of p(x) with increasing powers of a. 


$$\begin{aligned} c(x) &= (a + a^2) + (1 + a^2 + a^3)x + (1 + a)x^2 + (a + a^2 + a^3)x^3 \\ &\quad + (1 + a^3)x^4 + (1 + a^3)x^5 + (a^3)x^6 + (1 + a)x^7 \\ &\quad + (a + a^2 + a^3)x^8 + (a^3)x^9 + (1 + a)x^{10} + (a + a^2 + a^3)x^{11} \\ &\quad + (a^3)x^{12} + (1 + a)x^{13} + (a + a^2 + a^3)x^{14} \end{aligned}$$ 


Finally, we would express each of these coefficients of c(x) as binary vectors 
of length four positions by listing in order the binary coefficients of a. For 
example, we would express the first coefficient $0 + 1a + 1a^2 + 0a^3$ of c(x) as the 
vector (0110). Using this method, we would convert the entire codeword 
c(x) into a binary vector of length 60 positions by listing together these 
binary vectors of length four positions, including four zeros for all terms 
that could be present in c(x) but have a coefficient of 0 (of which there are 
none in this codeword). That is, we would convert c(x) to the following 
binary vector of length 60 positions. 


(011010111100011110011001000111000111000111000111000111000111 ) 
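The conversion just described can be sketched in Python. The names below are illustrative, field elements of the order-16 field from p(x) = x^4 + x^3 + 1 are stored as integers whose bit k is the coefficient of a^k, and the list coefficient_powers holds the exponent of a in each coefficient of c(x), from the constant term to x^14.

```python
PRIM, M = 0b11001, 4            # p(x) = x^4 + x^3 + 1

# power[k] is the field element a^k as a 4-bit integer.
power, elem = {}, 1
for k in range(15):
    power[k] = elem
    elem <<= 1
    if elem >> M:
        elem ^= PRIM

def to_bits(value):
    # List the coefficients of 1, a, a^2, a^3 in that order.
    return "".join(str((value >> k) & 1) for k in range(M))

# Exponents of a in the coefficients of c(x), in increasing powers of x.
# (A zero coefficient, absent in this codeword, would contribute "0000".)
coefficient_powers = [13, 11, 12, 8, 4, 4, 3, 12, 8, 3, 12, 8, 3, 12, 8]
vector = "".join(to_bits(power[k]) for k in coefficient_powers)
```

The string vector has 60 characters and reproduces the binary vector shown above, four bits per coefficient.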


It is clear by the fact that Reed-Solomon codewords are converted 
to binary vectors in this manner why Reed-Solomon codes are ideal for 
correcting error bursts. Specifically, four errors in the binary equivalent of 
c(x) constructed above could represent only a single error in c(x). Hence, 
although c(x) is a codeword in a code that is only 3-error correcting, it may 
be possible to correct a received vector to c(x) even if 12 errors occur during 
transmission of the binary equivalent of c(x). More generally, regarding 
the code C, we would say that provided only one error burst occurs during 
transmission of the binary equivalent of a codeword in C, we are guaranteed 
to be able to correct the received vector as long as the error burst is not 
longer than nine positions. This is because any error burst of length not 
longer than nine positions in the binary equivalent of a codeword in C could 
not span more than three of the coefficients in the codeword, while an error 
burst of length ten positions in the binary equivalent of a codeword in C 
could span four of the coefficients in the codeword. This statement can be 
generalized in an obvious manner to apply to any $RS(2^n - 1, t)$ code (see 
Written Exercise 7). 


As we mentioned in the introduction to this chapter, there are many 
situations in which transmission errors in binary vectors occur naturally in 
bursts. It is for these situations that Reed-Solomon codes are ideal. 


5.5 Reed-Solomon Codes with Maple 


In this section we show how Maple can be used to construct codewords 
and correct errors in the RS(31,4) code C that results from the primitive 
polynomial $p(x) = x^5 + x^3 + 1 \in Z_2[x]$. 


We begin by including the Maple linalg package, entering p(x), and 
assigning the number $2^5 = 32$ of elements in the field $F = Z_2[x]/(p(x))$ as 
the variable fs. 


> with(linalg): 
> p := x -> x^5 + x^3 + 1: 
> Primitive(p(x)) mod 2; 
true 
> fs := 2^(degree(p(x))); 
fs := 32 


Next, as we did for the BCH code in Section 4.3, we generate and store the 
field elements in the vector field by entering the following commands. 


> field := vector(fs); 


field := array(1..32, [ ]) 
> for i from 1 to fs-1 do 

> field[i] := Powmod(a, i, p(a), a) mod 2: 
> od: 

> field[fs] := 0: 

> evalm(field); 


[a, a^2, a^3, a^4, a^3 + 1, a^4 + a, a^3 + a^2 + 1, a^4 + a^3 + a, 
 a^4 + a^3 + a^2 + 1, a^4 + a + 1, a^3 + a^2 + a + 1, a^4 + a^3 + a^2 + a, 
 a^4 + a^2 + 1, a + 1, a^2 + a, a^3 + a^2, a^4 + a^3, a^4 + a^3 + 1, 
 a^4 + a^3 + a + 1, a^4 + a^3 + a^2 + a + 1, a^4 + a^2 + a + 1, 
 a^2 + a + 1, a^3 + a^2 + a, a^4 + a^3 + a^2, a^4 + 1, a^3 + a + 1, 
 a^4 + a^2 + a, a^2 + 1, a^3 + a, a^4 + a^2, 1, 0] 


As we also did for the BCH code in Section 4.3, we establish an association 
between the polynomial field elements and corresponding powers of a in the 
table ftable by entering the following commands. 


> ftable := table(): 

> for i from 1 to fs-1 do 

> ftable[ field[i] ] := a^i: 
> od: 

> ftable[ field[fs] ] := 0: 
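For readers who want to check the tables outside Maple, the same two structures can be sketched in Python. This is an illustration under stated assumptions: each field element is stored as a 5-bit integer whose bit k is the coefficient of a^k, and the Python names field and ftable only mirror the Maple names above (here ftable stores the exponent i rather than the expression a^i).

```python
PRIM, M = 0b101001, 5           # p(x) = x^5 + x^3 + 1
fs = 1 << M                     # 32 field elements

# field[i - 1] holds a^i for i = 1, ..., 31, mirroring Maple's 1-based vector.
field, elem = [], 1
for i in range(1, fs):
    elem <<= 1                  # multiply by a
    if elem >> M:               # reduce modulo p(a) when degree reaches 5
        elem ^= PRIM
    field.append(elem)
field.append(0)                 # last entry is the zero element

# ftable maps each nonzero polynomial field element to its exponent of a.
ftable = {elem: i + 1 for i, elem in enumerate(field[:-1])}
```

For instance, field[4] is 0b01001, the element a^3 + 1, and ftable[1] is 31 because a^31 = 1, matching the next-to-last entry of the Maple vector.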


5.5.1 Construction of the Codewords 


Before constructing any of the codewords in C, we must construct the 
generator polynomial for C. To do this, we first assign the number t = 4 
of errors the code is to be able to correct. We can then use the Maple 
product command as follows to construct the generator polynomial for C. 


> t := 4: 
> g := product(x-a^j, j=1..2*t); 

g := (x - a) (x - a^2) (x - a^3) (x - a^4) (x - a^5) (x - a^6) (x - a^7) (x - a^8) 

The process of expanding and simplifying this polynomial so that its co- 
efficients are written in a desirable form (as powers of a) is nontrivial as 
it requires several conversions between the polynomial field elements and 
powers of a. Hence, for performing this expansion and simplification, we 
have provided the user-written procedure rscoeff, for which code is given 
in Appendix C.1. If this procedure is saved as the text file rscoeff in the 
directory from which we are running Maple, then we can include the rscoeff 
procedure in this Maple session by entering the following command. 


> read rscoeff; 


We can then see the expanded and simplified form of the generator poly- 
nomial g(x) by entering the following command. 
> g := rscoeff(g, x, p(a), a); 


g := x^8 + a^6 x^7 + a^27 x^6 + a^4 x^5 + a^17 x^4 + a^13 x^3 + a^14 x^2 
+ a^2 x + a^5 


The first two parameters in the preceding command are the polynomial we 
wish to simplify and the variable used in this polynomial. The final two 
parameters are the primitive polynomial p(x) in terms of the field element 
a = x, followed by the field element a. 
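
The expansion that rscoeff performs can be imitated with log and antilog tables. The following Python sketch is not the rscoeff procedure from Appendix C.1; the helper names (exp_table, log_table, gf_mul, poly_mul) and the integer encoding of field elements are assumptions made for illustration.

```python
PRIM, M, ORDER, T = 0b101001, 5, 31, 4   # p(x) = x^5 + x^3 + 1, t = 4

# exp_table[i] = a^i as a 5-bit integer; log_table is the inverse lookup.
exp_table, log_table, value = [], {}, 1
for i in range(ORDER):
    exp_table.append(value)
    log_table[value] = i
    value <<= 1
    if value & (1 << M):
        value ^= PRIM

def gf_mul(x, y):
    """Multiply two field elements by adding their exponents modulo 31."""
    if x == 0 or y == 0:
        return 0
    return exp_table[(log_table[x] + log_table[y]) % ORDER]

def poly_mul(p, q):
    """Multiply polynomials over GF(32); coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= gf_mul(pi, qj)   # addition in GF(32) is XOR
    return out

g = [1]
for j in range(1, 2 * T + 1):
    g = poly_mul(g, [exp_table[j], 1])     # x - a^j equals x + a^j in characteristic 2

g_exponents = [log_table[c] for c in g]    # exponents of a, constant term first
print(g_exponents)
```

Read from the constant term upward, the printed exponents should be 5, 2, 14, 13, 17, 4, 27, 6, 0, that is, g(x) = x^8 + a^6 x^7 + a^27 x^6 + a^4 x^5 + a^17 x^4 + a^13 x^3 + a^14 x^2 + a^2 x + a^5.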


Recall that C is the set of all multiples b(x)g(x) of degree less than 31 
with b(x) ∈ F[x]. For example, consider the following polynomial b(x). 
> b := a^18*x^8 + a^20*x^7 + a^19*x^6 + a^23*x^5 + a^6*x^4 
> + a^2*x^3 + a^23*x^2 + a^4*x + a^15: 


We can construct the codeword c(x) = b(x)g(x) ∈ C that results from b(x) 
by entering the following command. Note that we use rscoeff so that c(x) 
will be displayed in a simplified form. 

> c := rscoeff(b*g, x, p(a), a); 

c := a^18 x^16 + a^14 x^15 + a^10 x^14 + a^30 x^13 + a^6 x^12 + a^24 x^11 
+ a^5 x^10 + a^26 x^9 + a^27 x^8 + a^30 x^7 + a^24 x^6 + a^27 x^5 + a^21 x^4 
+ a^9 x^2 + a^28 x + a^20 


Recall that to transmit this codeword we would convert c(x) into a binary 
vector using the process described in Section 5.4. To perform this conversion 
we have provided the user-written procedure binmess,¹ for which code is 
also given in Appendix C.1. If this procedure is saved as the text file 
binmess in the directory from which we are running Maple, we can include 
the binmess procedure in this Maple session as follows. 


> read binmess; 


We can then find the binary vector that corresponds to c(x) by entering 
the following command. 
> cbin := binmess(c, degree(p(x)), p(a), a, fs-2); 


cbin := [1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 
0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 


¹ The binmess procedure uses the Maple stackmatrix command. See footnote p. 51 
regarding the stackmatrix command. 
The first parameter in the preceding command is the polynomial for which 
we wish to find the binary equivalent. The last parameter is the largest 
possible degree fs-2 of the codewords in C. Although the codewords in C 
can be of degree up to 30, the degree of c(x) is only 16. Note that binmess 
recognizes that the terms in c(x) of degrees 17 through 30 have coefficients 
of zero and inserts appropriate zeros in the resulting binary vector. 
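
The encoding and the conversion to a binary vector can be sketched together in Python. This is a hedged illustration and not the binmess procedure itself: the helper names are ours, b(x) is entered through its exponents of a, and we assume each coefficient contributes five bits ordered from the coefficient of 1 up to the coefficient of a^4, which is the ordering the displayed start of cbin suggests.

```python
PRIM, M, ORDER, T = 0b101001, 5, 31, 4   # p(x) = x^5 + x^3 + 1, t = 4

exp_table, log_table, value = [], {}, 1
for i in range(ORDER):
    exp_table.append(value)
    log_table[value] = i
    value <<= 1
    if value & (1 << M):
        value ^= PRIM

def gf_mul(x, y):
    if x == 0 or y == 0:
        return 0
    return exp_table[(log_table[x] + log_table[y]) % ORDER]

def poly_mul(p, q):
    """Product of two polynomials over GF(32), constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= gf_mul(pi, qj)
    return out

# Generator polynomial (x - a)(x - a^2)...(x - a^8).
g = [1]
for j in range(1, 2 * T + 1):
    g = poly_mul(g, [exp_table[j], 1])

# b(x) from the text, as exponents of a, constant term first.
b_exponents = [15, 4, 23, 2, 6, 23, 19, 20, 18]
b = [exp_table[e] for e in b_exponents]
c = poly_mul(b, g)                       # codeword c(x) = b(x) g(x)

def to_bits(poly, max_degree=ORDER - 1):
    """Pad to max_degree + 1 coefficients and emit 5 bits per coefficient."""
    coeffs = poly + [0] * (max_degree + 1 - len(poly))
    return [(coef >> k) & 1 for coef in coeffs for k in range(M)]

cbin = to_bits(c)
print(len(cbin), cbin[:15])
```

The padding in to_bits plays the role of the last binmess parameter fs-2: coefficients of degrees 17 through 30 are zero and still occupy five positions each, so the vector always has 31 · 5 = 155 entries.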


5.5.2 Error Correction 


Suppose a codeword in C is transmitted as a binary vector and we receive 
the following vector rbin. 


> rbin := vector([1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 
>07 0, 0; 0,0, 1, <1, 1,00, 7, 0, 15. 1,0, 1, 0,05. 15 1,-1,--0; 
> Ops 05° Lye Oy ts. Ly On- Ly. Aa 15-05. da Ea 04-05 1. 1.0. 052-25 
> 0, 2,0; 2,1, 0, 1,0, 1, 15 4,. 15 T3 0;.0;..0, 0; 1, 2,0, 1, 
> Ty 20. 15 0: 105-0 15-07... 15.105. 050,05: (155-0... 05. 4, 
> 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]): 


To correct this received vector we must first convert the vector to its poly- 
nomial equivalent. To help with this conversion we have provided the user- 
written procedure bincoeff, for which code is given in Appendix C.1. As- 
suming this procedure is saved as the text file bincoeff in the directory from 
which we are running Maple, we can include this procedure in our Maple 
session as follows. 


> read bincoeff; 


Then, by entering the following command we obtain an ordered list of the 
coefficients in the polynomial equivalent of rbin. The first parameter in 
the following command is the number of binary digits in rbin that should 
be used to form each of these coefficients. 


> pcoeff := bincoeff(5, rbin); 


pcoeff := lat +a? +a? +a+1, a? +1, at +a? +a? +1,0, 
at +a? +a+1, af +a? +a, af +a? +a’, af + a?, 
at he +a, af ba +a+Hl, a ee +a, a? +aHl, 

at +a? +a? Pot he, +a? +1, taf lo at 
af +a, at +a? +1, aô +a? +a+1, af +a’, at +a’ +1, 
0, 0, 0, 0, 0, 0, 0, 0] 


We can then construct a polynomial r(x) with these coefficients by entering 
the following command. 
> r := sum('pcoeff[i]*x^(i-1)', 'i'=1..vectdim(pcoeff)); 
r:=1+a+a? +a? + at + (a? +1) 24 (at +a’ +0? +1)2? 
(af +a? +44 1)a*+4 (af +a? +4) 2° + (af +a’? +a?) r’ 


af +a?) r" + (a* +a? +a) 2° + (at +a +a+1)2° 


a t 
at tat 1) wi (a? a?) gl? (af a?) 78 
2 


a ee +1 


We can use rscoeff as follows to simplify the coefficients in the preceding 
polynomial r(x). 


> r := rscoeff(r, x, p(a), a); 


r:= qi8 22 + at g?l ai a2! x2 + at8 79 tal’? r +a 
10 ,,16 7,15 414 a2? 8 +a% al +a? git 


+a 2 a’ x ax 
4+ alô x1? + al? x? + a? x8 + a3? x7 p at xô p a? r’ 
21,4, 9,2 , 28 20 


Finally, we enter the following unapply command so that we can evaluate 
r(x) in the usual manner. 
> r := unapply(r, x); 


r:= gr —> at? 22 4 at gx?! Te qa?! x + qi8 9 + qi? 78 + qi gi? 
on at? 716 T a15 414 20 ,,13 + a26 al fae a2? git 


ax ax a x 
al ge lO + al? x? + a? x8 4.80 x7 p 24 p a? r’ 
a?! xt + a? r? + a? r Ha? 


We will now use the Reed-Solomon error correction scheme to correct 
r(x). Recall that to correct r(x) we begin by computing the first 2t syn- 
dromes of r(x). Before doing this we create a vector Sa of length 2t positions 
in which to store the syndromes. In the subsequent loop we generate and 
store the first 2t syndromes of r(x) in Sa. Note that we use the Maple 
Rem function to simplify the syndromes, and that we use ftable to find 
the representations of the syndromes as powers of a. The if statement that 
appears in the loop reduces a syndrome to 1 if it is expressed as a 
raised to the order of F* (that is, as a^31). 


> Sa := vector(2*t); 


Sa := array(1..8,[]) 


> for i from 1 to 2*t do 
> Sa[i] := ftable[ Rem(r(a^i), p(a), a) mod 2 ]: 
> if degree(Sa[i], a) = (fs-1) then 
> Sa[i] := Sa[i]/a^(fs-1): 
> fi: 
> od: 
> evalm(Sa); 
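
The syndrome computation can be sketched in Python as follows. The sketch is only illustrative. It evaluates the codeword c(x) = b(x)g(x) of Section 5.5.1 (entered through its exponents of a, with None marking a zero coefficient) and a hypothetical received word of our own that differs from c(x) in one coefficient; it does not use the vector rbin above.

```python
PRIM, M, ORDER, T = 0b101001, 5, 31, 4   # p(x) = x^5 + x^3 + 1, t = 4

exp_table, log_table, value = [], {}, 1
for i in range(ORDER):
    exp_table.append(value)
    log_table[value] = i
    value <<= 1
    if value & (1 << M):
        value ^= PRIM

def gf_mul(x, y):
    if x == 0 or y == 0:
        return 0
    return exp_table[(log_table[x] + log_table[y]) % ORDER]

def poly_eval(poly, x):
    """Horner evaluation of a polynomial (constant term first) at x."""
    acc = 0
    for coef in reversed(poly):
        acc = gf_mul(acc, x) ^ coef
    return acc

def syndromes(poly):
    """Return [poly(a), poly(a^2), ..., poly(a^(2t))]."""
    return [poly_eval(poly, exp_table[i]) for i in range(1, 2 * T + 1)]

# Codeword of Section 5.5.1 as exponents of a, constant term first.
c_exponents = [20, 28, 9, None, 21, 27, 24, 30, 27, 26, 5, 24, 6, 30, 10, 14, 18]
c = [0 if e is None else exp_table[e] for e in c_exponents]

# Hypothetical received word: add the error a^7 x^13 to the codeword.
r = list(c)
r[13] ^= exp_table[7]

print(syndromes(c))   # all zero for a codeword
print(syndromes(r))   # nonzero once an error is present
```

For a single error a^7 x^13, the ith syndrome is a^7 · (a^i)^13, so each syndrome depends only on the error and not on the codeword that was sent.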


Next, we use the Maple sum command to form the resulting syndrome 
polynomial S(z), and use unapply to convert S(z) into a function that we 
can evaluate in the usual manner. 

> S := sum('Sa[j+1]*z^j', 'j'=0..2*t-1); 


214 4 3 6 


S:=a +a” z +a +a z Ha haltz +az Fa z 


> S := unapply(S, z); 


We must now construct the Euclidean algorithm table for S(z) and the 
following polynomial f(z) = z^8. 
> f := z^(2*t); 


f := z^8 


To perform the calculations necessary in constructing this table, we have 
provided the user-written procedure rseuclid, for which code is given in 
Appendix C.1. Assuming this procedure is saved as the text file rseuclid in 
the directory from which we are running Maple, we can include this pro- 
cedure in our Maple session as follows. (Note: Because rseuclid calls and 
uses the rscoeff procedure we discussed previously, the rscoeff procedure 
must be saved as the text file rscoeff in the same directory as rseuclid.) 


> read rseuclid; 


Then the following command causes Maple to construct and display the 
entries in each row of the Euclidean algorithm table for S(z) and f(z), stop- 
ping at the appropriate row for the Reed-Solomon error correction scheme. 
The first parameter in this command is the number of errors that can be 
corrected in the code. The next two parameters are the two polynomials for 
which we are constructing the table. The fourth parameter is the variable 
z used in the previous two parameters. As with rscoeff, the final two 
parameters are the primitive polynomial p(x) in terms of the field element 
a, followed by the field element a. 


> res := rseuclid(t, f, S(z), z, p(a), a); 

Q Soar gob ae, R =, 
a 278 pa? z5 ta 24 | a? 2? | a? 2? a z i a. V 
a ear. U =,1 


Q = 5a 2+, R =, 


al? 25 pa 24 4 93 2 ataza, V =, 
a z? +a? z+a®, U =,a2+a° 
Q =,a°z+a”", R=, 
a2 z4 2134 926 224 G27 zta, V =, 
alt 23 424 Nyt U =, al 22 +a 2404 
Q =,a0%z+08, R atta ta zta, V 
z+ +a 2+, U =, 
aè z3 +a? z? +a” 2+ a"! 


res := a? z +a’, a” 2? fake +a! z+ a7, 
5 4 24 24 gt z+ a9, a8 23 + a6 22 4+ a8 zta?! 


a 244+ 22 +4+a%4z 


Note that the preceding process stops at the first row for which the degree 
of the entry in the R column is less than t = 4. Note also that the process 
leaves the vector res containing the entries in the last computed row of 
the table. Hence, the entries in the table in which we are interested, the 
polynomials R(z) in the R column and V(z) in the V column, are the 
second and third entries in res. We define these entries next as the variables 
R and V, and use unapply to convert each to a function that we can evaluate 
in the usual manner. 


> R := res[2]; 
R:= 0t 2 paT z2 pal’ zp a2 
> R := unapply (R, z); 
R:= 2z > a” 2 ta z2 al zH a? 
> V := res[3]; 
V := a^5 z^4 + z^3 + a^24 z^2 + a^2 z + a^9 
> V := unapply(V, z); 
V := z -> a^5 z^4 + z^3 + a^24 z^2 + a^2 z + a^9 


Next, we find the roots of V by trial and error as follows. These commands 
also show the corresponding error positions in r(x). 


> for i from 1 to fs-1 do 
> if (Rem(V(a^i), p(a), a) mod 2) = 0 then 
> print(a^i, ` is a root of `, V(z), ` error position is `, 
> degree(a^(fs-1)/a^i, a)); 
> fi: 
> od; 

a^15, is a root of , a^5 z^4 + z^3 + a^24 z^2 + a^2 z + a^9, 
error position is , 16 

a^16, is a root of , a^5 z^4 + z^3 + a^24 z^2 + a^2 z + a^9, 
error position is , 15 

a^17, is a root of , a^5 z^4 + z^3 + a^24 z^2 + a^2 z + a^9, 
error position is , 14 

a^18, is a root of , a^5 z^4 + z^3 + a^24 z^2 + a^2 z + a^9, 
error position is , 13 


To find the coefficients of the terms in the error polynomial, we need to 
find the derivative of V. We find this derivative next using the Maple diff 
command. 


> Vp := diff(V(z), z) mod 2; 


Vp := z^2 + a^2 


> Vp := unapply(Vp, z); 


Vp := z -> z^2 + a^2 


We can then define the coefficients of the four terms in the error polynomial 
as follows. 
> e13 := ftable[ Rem(R(a^18), p(a), a) mod 2 ] / 
> ftable[ Rem(Vp(a^18), p(a), a) mod 2 ]: 
> e14 := ftable[ Rem(R(a^17), p(a), a) mod 2 ] / 
> ftable[ Rem(Vp(a^17), p(a), a) mod 2 ]: 
> e15 := ftable[ Rem(R(a^16), p(a), a) mod 2 ] / 
> ftable[ Rem(Vp(a^16), p(a), a) mod 2 ]: 
> e16 := ftable[ Rem(R(a^15), p(a), a) mod 2 ] / 
> ftable[ Rem(Vp(a^15), p(a), a) mod 2 ]: 


Next, we form the error polynomial e(x) that corresponds to r(x). 


> e := e16*x^16 + e15*x^15 + e14*x^14 + e13*x^13; 


e := att 716 eh ax ae qi? xlt J a’ xt 


> e := unapply(rscoeff(e, x, p(a), a), x); 


e := x > alf xt? + ax!” + al? grlt +a? x 


Finally, by adding this error polynomial to r(x), we obtain the following 
corrected codeword c(x) ∈ C. 


> c := rscoeff(r(x)+e(x), x, p(a), a); 

c= qi8 2? + ai? g?! oe a21 20 az at8 xt? at ai? xt’ + qi gi? 
4+ atr! p al! gt + a? rt + al? t3 + a rl? +a? r! 
4 al g1 4 a! g? 4 a? gê 4 a3? xT 4 at rô pa? rt 


+a?! rt + a? r? +a rpa 


The following command shows that c(x) is in C by verifying that c(a^i) = 0 
for i = 1, ..., 2t. 


> seq(Rem(subs(x=a^i, c), p(a), a) mod 2, i=1..2*t); 


0, 0, 0, 0, 0, 0, 0, 0 


To see the positions in rbin that contained errors, we can use the binmess 
routine as follows to find the binary representation of e(x). 


> ebin := binmess(e(x), degree(p(x)), p(a), a, fs-2); 


ebin := [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 


We can then use the Maple sum command as follows to see the number of 
binary errors in rbin. 


> berrors := sum('ebin[i]', 'i'=1..vectdim(ebin)); 


berrors := 8 
Note that although C is only 4-error correcting, we were able to correct the 
binary vector rbin even though it contained 8 errors. This is because 
the binary errors in rbin occurred together (i.e., as an error burst) and 
resulted in only four errors in the corresponding polynomial r(x). 
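
The steps of this section (syndromes, the Euclidean algorithm table, the roots of V, and the quotients R/V') can be collected into one short Python sketch. It is a minimal illustration under stated assumptions and not a transcription of rseuclid: all helper names are ours, polynomials are lists with the constant term first, and the received word is the Section 5.5.1 codeword with a hypothetical error burst placed in positions 13 through 16 rather than the vector rbin above.

```python
M, N, T, PRIM = 5, 31, 4, 0b101001   # GF(2^5), length 31, t = 4, p(x) = x^5 + x^3 + 1

EXP, LOG, value = [], {}, 1
for i in range(N):
    EXP.append(value)
    LOG[value] = i
    value <<= 1
    if value & (1 << M):
        value ^= PRIM

def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[(LOG[x] + LOG[y]) % N]

def inv(x):
    return EXP[-LOG[x] % N]

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) ^ (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pmul(p, q):
    out = [0] * max(len(p) + len(q) - 1, 0)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] ^= mul(x, y)
    return trim(out)

def pdivmod(num, den):
    num, den = trim(num), trim(den)
    quo = [0] * max(len(num) - len(den) + 1, 1)
    while num and len(num) >= len(den):
        shift, factor = len(num) - len(den), mul(num[-1], inv(den[-1]))
        quo[shift] = factor
        num = add(num, [0] * shift + [mul(factor, d) for d in den])
    return trim(quo), num

def peval(p, x):
    acc = 0
    for coef in reversed(p):
        acc = mul(acc, x) ^ coef
    return acc

def decode(r):
    """Correct up to T coefficient errors in r; behavior is undefined beyond that."""
    out = list(r) + [0] * (N - len(r))
    synd = [peval(r, EXP[j]) for j in range(1, 2 * T + 1)]
    if not any(synd):
        return out
    r_prev, r_cur = [0] * (2 * T) + [1], trim(synd)   # f(z) = z^(2t) and S(z)
    v_prev, v_cur = [], [1]
    while len(r_cur) - 1 >= T:                        # stop once deg R < t
        q, rem = pdivmod(r_prev, r_cur)
        r_prev, r_cur = r_cur, rem
        v_prev, v_cur = v_cur, add(v_prev, pmul(q, v_cur))
    # Formal derivative V'(z); only odd-degree terms survive in characteristic 2.
    v_der = [coef if k % 2 else 0 for k, coef in enumerate(v_cur)][1:]
    for i in range(1, N + 1):
        root = EXP[i % N]
        if peval(v_cur, root) == 0:                   # a^i is a root of V
            out[(N - i) % N] ^= mul(peval(r_cur, root), inv(peval(v_der, root)))
    return out

c_exponents = [20, 28, 9, None, 21, 27, 24, 30, 27, 26, 5, 24, 6, 30, 10, 14, 18]
c = [0 if e is None else EXP[e] for e in c_exponents] + [0] * 14
r = list(c)
for position, exponent in [(13, 3), (14, 12), (15, 1), (16, 16)]:   # hypothetical burst
    r[position] ^= EXP[exponent]

print(decode(r) == c)
```

Because the four damaged coefficients are adjacent, the burst touches at most 20 consecutive binary positions, and the sketch treats it exactly as four coefficient errors, which is the point made in the paragraph above.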


5.6 Reed-Solomon Codes in Voyager 2 


In August and September 1977, NASA launched the Voyager 1 and Voyager 
2 satellites from Cape Canaveral, Florida. Upon reaching their first desti- 
nation goals of Jupiter and Saturn, the Voyager satellites provided NASA 
with the most detailed analyses and visual images of these planets and their 
moons that had ever been observed. After encountering Jupiter and Sat- 
urn, Voyager 2 continued farther into the outer reaches of our solar system 
and successfully transmitted to Earth data and visual images from Uranus 
and Neptune. Without the use of a Reed-Solomon code in transmitting 
these images, the extreme success achieved by Voyager 2 would have been 
very unlikely. We briefly describe the image transmission process next. 


Images transmitted to Earth from outer space are usually digitized 
into binary strings and sent over a space channel. Voyager 2 digitized 
its full-color images into binary strings with 15,360,000 positions. Using 
an uncompressed spacecraft telecommunication system, these binary digits 
were transmitted one by one to Earth, where the images were then recon- 
structed. This uncompressed system was the most reliable system available 
when Voyager 2 was launched, and was satisfactory for transmitting im- 
ages to Earth from Jupiter and Saturn. However, when Voyager 2 arrived 
at Uranus in January 1986, it was about twice as far from Earth as it had 
been when at Saturn. Since the transmission of binary digits to Earth had 
already been stretched to a very slow rate from Saturn (around 44,800 dig- 
its per second), a new transmission scheme was needed in order for NASA 
to be able to receive a large number of images from Uranus. 


The problem of image transmission from Uranus was solved through 
the work of Robert Rice at California Institute of Technology’s Jet Propul- 
sion Laboratory. Rice developed an algorithm that implemented a com- 
pressed spacecraft telecommunication system which reduced by a factor of 
2.5 the amount of data needed to transmit a single image from Uranus with- 
out causing any loss in image quality. However, there was a problem with 
Rice’s algorithm. During the long transmissions through space, the com- 
pressed binary strings experienced errors much more frequently than the 
uncompressed strings, and Rice’s algorithm was very sensitive to binary 
errors. In fact, if a received binary transmission from Uranus contained 
even a single error, the resulting image would be completely ruined. After 
considerable study, it was discovered that the binary errors that occurred 
during the long transmissions through space usually occurred in bursts. To 
account for these error bursts, a new system was designed in Voyager 2 for 
converting images into binary strings that utilized a Reed-Solomon code. 
The binary strings were compressed and transmitted back to Earth, and 
then uncompressed using Rice’s algorithm and corrected using the Reed- 
Solomon error correction scheme. This process was highly successful. The 
specific Reed-Solomon code used in Voyager 2 is mentioned in Maple Ex- 
ercise 4. 


After leaving Uranus, Voyager 2 continued its journey through space. 
In August 1989 the satellite transmitted data and visual images to Earth 
that provided NASA with most of the information currently known about 
Neptune. At present, the Voyager 2 satellite is still in operation and is still 
providing NASA with invaluable information about our solar system. 


In addition to being used in the transmission of images through space, 
Reed-Solomon codes have a rich assortment of other applications, and are 
claimed to be the most frequently used digital error-correcting codes in the 
world. As we mentioned in the introduction to this chapter, Reed-Solomon 
codes are used extensively in the encoding of various types of information on 
compact discs. Also, Reed-Solomon codes have played an integral role in the 
development of high-speed supercomputers. In the future, Reed-Solomon 
codes will be an important tool for dealing with complex communication 
and information transfer systems. 
Written Exercises 


1. Let C be the RS(7, 2) code that results from the primitive polynomial 
p(x) = x^3 + x + 1 ∈ Z_2[x]. 


(a) Construct and simplify the generator polynomial for C. 

(b) Construct two of the codewords in C. 

(c) Convert the codewords you constructed in part (b) into binary 
vectors using the process described in Section 5.4. 


(d) Find the maximum error burst length that is guaranteed to be 
correctable in C. 


2. Correct the following received polynomials in the Reed-Solomon code 
C in Written Exercise 1. 


(a) r(x) = aba’ + ax5 + azt + ax? + atr + až 
(b) r(x) = ata® + aa? + zt + atr’? + aa? + a?x + a’ 
(c) r(x) = av? + atx* + afr? + aba? + abr + at 


3. Use the Reed-Solomon error correction scheme to correct the following 
received polynomial r(x) in the BCH code C in Example 4.3. 


r(x) = 1l+ett+e te’ +r tao +2 


4. Correct the following received polynomials in the Reed-Solomon code 
C in Example 5.3. 


(a) r(x) = ax!? + a?r!!! + asr! + afr? + 784 a'z’ + ax 
a®x® + atat + a'r’ + ax? + abe 
(b) r(x) = arli 4 alr! + a3r!? + afr!! + altr! + atr? 
alg’ + ax" + allr’ + alx + altri + aTr? 
L allg 4 að 


5. Use the reverse of the process described in Section 5.4 to convert the 
following binary vector to the polynomial codeword it represents in 
the Reed-Solomon code C in Example 5.3. 


( 110101010000111000111010101111101011100100111001000100111101 ) 


6. Prove Theorem 5.1. 


7. Let C be an RS(2^n - 1, t) code. In terms of n and t, find the maximum 
error burst length that is guaranteed to be correctable in C. 
Maple Exercises 


1. Correct the following received binary vector r in the RS(31,4) code 
C considered in Section 5.5. 


r = (1010010001010011010000010101010011110110000101010010 
1101111101000101001101000001110111110000000011000001 
000110110010000000000000000000000000000000000000000) 


2. Let C be the RS(127,4) code that results from the primitive 
polynomial p(x) = x^7 + x + 1 ∈ Z_2[x]. 
(a) Construct and simplify the generator polynomial g(x) for C. 
(b) Construct the codeword b(x)g(x) € C that results from the fol- 
lowing polynomial b(x). 


b(a) = ar10 4 a25! 4 gy? 4 a? 


(c) Convert the codeword b(x)g(x) in part (b) into a binary vector 
using the process described in Section 5.4 (i.e., using the 
binmess procedure). 

(d) Find the maximum error burst length that is guaranteed to be 
correctable in C. 


3. Correct the following received polynomials in the Reed-Solomon code 
C in Maple Exercise 2. 


(a) r(x) = at 722 us at? gx?! a 00.720 ae a28 719 + qil4,18 


+g glT 4 g8lyl6 4 arl 4 gd6yl4 4 g59y13 
4 gd8z12 4 g83q11 + g42_10 


(b) r(x) = ql07108 4 g607107 4 497106 4 g1157105 4 g18,,104 
4 gi247103 4 q67 7102 4 g877101 4 946,100 


(c) r(x) = atO 7.81 ajh a23 x8? + aig? + qi 8x8 4 8 7-77 
+ a8 776 + a575 a a50 r74 =e a? x3 


4. The Reed-Solomon code used in the Voyager 2 satellite (see the 
discussion in Section 5.6) was the RS(255,16) code C that results from the 
primitive polynomial p(x) = x^8 + x^4 + x^3 + x^2 + 1 ∈ Z_2[x]. Construct 
several of the codewords in C as polynomials or binary vectors. Also, 
illustrate the Reed-Solomon error correction scheme in C by correcting 
several received binary vectors or polynomials that contain errors. 
Write a summary of your results. 
Chapter 6 
Algebraic Cryptography 


Cryptography is the study of techniques that can be used to disguise a mes- 
sage so that only the intended recipient can remove the disguise and read 
it. The simplest way to disguise a message is to replace every occurrence 
of each specific character with a different character. This method for dis- 
guising a message yields what we will call a substitution cipher. Since sub- 
stitution ciphers appear as puzzles in many newspapers and puzzle books, 
they are obviously relatively easy to “break” and should not be used when 
sending “top secret” information. In the next three chapters, we discuss 
some disguising techniques that involve applying mathematical operations 
to messages. Using mathematics to disguise messages gives us the 
ability to create disguising techniques that are increasingly difficult to 
break simply by choosing mathematical operations that are increasingly 
complex. Because the disguising techniques we discuss involve 
ing mathematical operations to messages, these techniques are examples of 
algebraic cryptography. 


6.1 Some Elementary Cryptosystems 


We will call an undisguised message a plaintext and a disguised message 
a ciphertext. Also, we will call the process of converting a plaintext to a 
ciphertext the encryption or encipherment of the message, and we will call 
the reverse process the decryption or decipherment of the message. 


Since we will encipher messages by applying mathematical operations, 
our plaintext characters will have to have some kind of mathematical 
structure. We will give our messages the structure of a ring so that we can both 
add and multiply message characters. 


Definition 6.1 A cryptosystem consists of the following: 


1. An alphabet L that contains all characters that can be used in 
messages (letters, numerals, punctuation marks, blank spaces, etc.), 


2. A commutative ring R with identity such that |R| = |L|, 


3. Bijections α : L → R and f : R → R. 


The approach we will take is that to encipher a plaintext that is expressed 
as a list of elements in L, we first use α to convert the plaintext into a list 
of elements in R. We can then form the ciphertext by applying f to the 
plaintext ring elements and, if desired, use α^{-1} to convert the ciphertext 
back to a list of elements in L. We can recover the plaintext from the 
ciphertext by repeating this procedure using f^{-1} instead of f. So that only 
the intended recipient of the message can recover the plaintext, only the 
intended recipient can know f^{-1}. We will always assume that everything else 
in a cryptosystem, with the obvious exception of f, is public knowledge. (It 
is true in some cryptosystems that f can be public knowledge without 
revealing f^{-1}. Such cryptosystems are called public-key systems. We discuss 
two public-key cryptosystems, the well-known RSA and ElGamal systems, 
in Chapters 7 and 8.) 


For simplicity, in this chapter we will assume that all messages are 
written in the alphabet L = {A, B, ..., Z}. Also, we will take R = Z_26 
and let α : L → R be given by α(A) = 0, α(B) = 1, ..., α(Z) = 25. For 
reference, we list the correspondence for α below. 


 A   B   C   D   E   F   G   H   I   J   K   L   M   N   O 
 0   1   2   3   4   5   6   7   8   9  10  11  12  13  14 


 P   Q   R   S   T   U   V   W   X   Y   Z 
15  16  17  18  19  20  21  22  23  24  25 


We now consider some cryptosystems with different types of encryption 
methods. That is, we consider some cryptosystems with different types of 
bijections f : R → R. 


Encryption Method 1: Choose f : R → R by f(x) = ax mod |R| for 
some a ∈ R with (a, |R|) = 1. 
Example 6.1 Let f(x) = 3x mod 26. Then the message “ATTACK AT 
DAWN” enciphers as follows. 


            A   T   T   A   C   K   A   T   D   A   W   N 
α      =>   0  19  19   0   2  10   0  19   3   0  22  13 
f      =>   0   5   5   0   6   4   0   5   9   0  14  13 
α^{-1} =>   A   F   F   A   G   E   A   F   J   A   O   N 


Hence, the corresponding ciphertext is “AFFAGEAFJAON”. To decipher 
this message, we repeat the same procedure using the inverse function 
f^{-1}(x) = a^{-1}x mod |R| = 9x mod 26 instead of f. (Note: We can 
determine a^{-1} in general by using the Euclidean algorithm as illustrated in 
Section 7.1.) This message deciphers, of course, as follows. 


            A   F   F   A   G   E   A   F   J   A   O   N 
α      =>   0   5   5   0   6   4   0   5   9   0  14  13 
f^{-1} =>   0  19  19   0   2  10   0  19   3   0  22  13 
α^{-1} =>   A   T   T   A   C   K   A   T   D   A   W   N 


Note that 3^{-1} = 9 mod 26 because 3 · 9 = 27 = 1 mod 26. Note also that 
we are guaranteed a multiplicative inverse of a = 3 exists modulo |R| = 26 
because of the requirement that (a, |R|) = 1 (see Written Exercise 10). ■ 
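
Example 6.1 can be reproduced with a few lines of Python. This is a minimal sketch: the function names are ours, blanks are simply dropped before enciphering, and the three-argument pow with exponent -1 (available from Python 3.8) stands in for the Euclidean algorithm mentioned above.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"   # position in the string plays the role of α

def encipher(text, a, n=26):
    """Apply f(x) = a*x mod n to every letter of text; other characters are dropped."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    return "".join(ALPHABET[(a * ALPHABET.index(ch)) % n] for ch in letters)

def decipher(text, a, n=26):
    """Apply the inverse function, which multiplies by a^{-1} mod n."""
    return encipher(text, pow(a, -1, n), n)

ciphertext = encipher("ATTACK AT DAWN", 3)
plaintext = decipher(ciphertext, 3)
print(ciphertext, plaintext)   # AFFAGEAFJAON ATTACKATDAWN
```

If a is not relatively prime to 26, pow raises a ValueError, which mirrors the requirement (a, |R|) = 1.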


We will say that any person except the intended recipient who tries to 
decipher an encrypted message is an intruder and is attempting to break 
the cryptosystem. Two people wishing to exchange a secret message would 
certainly want to use a cryptosystem that an intruder would find difficult to 
break. However, breaking a cryptosystem is not generally as difficult as it 
may at first appear. Recall that everything in a cryptosystem is assumed to 
be public knowledge except f and f^{-1}, and in practice it is usually assumed 
that even the form of f is publicly known and that only the parameters in f 
are unknown to intruders. We will call the parameters in f the keys of the 
cryptosystem because if an intruder is able to find these parameters, the 
intruder should then be able to determine f^{-1} (i.e., “unlock” the system). 


The cryptosystem in Example 6.1 has a = 3 as its only key. As you may 
have supposed, this system is not very secure. However, this is not because 
it has only a single key or because it is so easy to use for enciphering mes- 
sages. Recall we would assume that intruders know everything about this 
system except the parameters in f and f^{-1}. Indeed, we would even assume 
that intruders know f(x) = ax mod 26 for some a ∈ Z_26 with (a, |R|) = 1. 
With a function f of this form, an intruder could very quickly determine the 
key by trial and error. Specifically, because the only elements in Z_26 that 
are relatively prime to 26 are {1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25}, one 
of these elements must be the key. An intruder could very easily take each of 
these key candidates separately, form the corresponding inverse functions 
f^{-1}, and use each f^{-1} to decipher the encrypted message. Most likely, 
only one of these decipherments, the correct plaintext, will make any sense. 
Even if the calculations are done with a pencil and paper, an intruder could 
break this system in only a few minutes. 
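
The trial-and-error attack just described is easy to sketch in Python. The names below are our own, and the sketch only lists the twelve candidate decipherments; deciding which one "makes sense" is left to the reader.

```python
from math import gcd

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def decipher_with(ciphertext, a, n=26):
    """Decipher under the assumption that f(x) = a*x mod n was used."""
    a_inv = pow(a, -1, n)
    return "".join(ALPHABET[(a_inv * ALPHABET.index(ch)) % n] for ch in ciphertext)

# Every key candidate: the elements of Z_26 relatively prime to 26.
keys = [a for a in range(26) if gcd(a, 26) == 1]
candidates = {a: decipher_with("AFFAGEAFJAON", a) for a in keys}

for a, text in candidates.items():
    print(a, text)
```

Only the line for a = 3 reads as English, which is how an intruder would recognize the key.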


One obvious way to enhance the security of the cryptosystem in Ex- 
ample 6.1 is to include a constant term in the function f. We summarize 
this in general as our second encryption method. 


Encryption Method 2: Choose f : R → R by f(x) = ax + b mod |R| for 
some a, b ∈ R with (a, |R|) = 1. 


Example 6.2 Let f(x) = 3x + 4 mod 26. Then the message “ATTACK 
AT DAWN” enciphers as follows. 


            A   T   T   A   C   K   A   T   D   A   W   N 
α      =>   0  19  19   0   2  10   0  19   3   0  22  13 
f      =>   4   9   9   4  10   8   4   9  13   4  18  17 
α^{-1} =>   E   J   J   E   K   I   E   J   N   E   S   R 


Hence, the corresponding ciphertext is “EJJEKIEJNESR”. To decipher 
this message, we repeat the same procedure using the inverse function 
f^{-1}(x) = a^{-1}(x - b) mod |R| = 9(x - 4) mod 26 instead of f. ■ 
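
A short Python sketch of the second encryption method follows. The function names are ours, and the count at the end simply enumerates every admissible key pair (a, b) for R = Z_26.

```python
from math import gcd

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def affine_encipher(text, a, b, n=26):
    """Apply f(x) = a*x + b mod n to every letter of text."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    return "".join(ALPHABET[(a * ALPHABET.index(ch) + b) % n] for ch in letters)

def affine_decipher(text, a, b, n=26):
    """Apply the inverse function a^{-1} * (x - b) mod n."""
    a_inv = pow(a, -1, n)
    return "".join(ALPHABET[(a_inv * (ALPHABET.index(ch) - b)) % n] for ch in text)

ciphertext = affine_encipher("ATTACK AT DAWN", 3, 4)
plaintext = affine_decipher(ciphertext, 3, 4)
key_pairs = [(a, b) for a in range(26) if gcd(a, 26) == 1 for b in range(26)]
print(ciphertext, plaintext, len(key_pairs))   # EJJEKIEJNESR ATTACKATDAWN 312
```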


The cryptosystem in Example 6.2 has two keys, a = 3 and b = 4. While 
this system is more secure than the cryptosystem in Example 6.1, it is still 
not a very secure system. We would assume that an intruder who intercepts 
the ciphertext in Example 6.2 would know that f(x) = ax + b mod 26 for 
some a, b ∈ Z_26 with (a, |R|) = 1. Hence, since a must be one of the twelve 
elements in {1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25}, and b must be one of the 
26 elements in Z_26, there will only be 12 · 26 = 312 possible pairs (a, b) of 
keys. An intruder using only a hand-held calculator could easily test each 
of these pairs of key candidates in only a few hours. More importantly, 
an intruder using a computer that can perform millions of operations per 
second could test each of these pairs of key candidates immediately. Thus, 
although the cryptosystem in Example 6.2 has more key candidates than 
the system in Example 6.1, it can still be broken very easily. 


We have shown that neither of the cryptosystems described in this 
section is secure by presenting rather simple mathematical methods for 
breaking them. We can also see that these systems are not secure because 
they both just yield substitution ciphers. And the systems described in 
this section are actually easier to break than nonmathematical substitution 
ciphers, for in each case our general procedure for breaking the system 
can easily be programmed on a computer, while breaking nonmathematical 
substitution ciphers generally requires somewhat time-consuming frequency 
analysis (i.e., trial and error under the assumption that the most 
frequently occurring ciphertext characters correspond to the most commonly 
used plaintext characters). However, the cryptosystems described in 
this section were not presented in an attempt to illustrate secure systems, 
but rather because they will be generalized in Sections 6.2 and 6.4 into 
well-known systems that can be designed with any desired level of security. 


6.2 The Hill Cryptosystem 


The cryptosystems described in Section 6.1 have been known for many 
years. In fact, a variation of our second encryption method in Section 
6.1 was used in ancient Rome by Julius Caesar, who supposedly invented 
it himself (see Written Exercise 1). While the advent of calculators and 
computers has rendered these systems obsolete, generalizations of these 
systems that use matrices as keys instead of scalars can still be constructed 
with any desired level of security. The generalization of our first encryption 
method was first described by Lester Hill in 1929. We summarize this in 
general as our next encryption method. 


Hill Encryption Method: Let A be an n × n invertible matrix over R 
(i.e., such that (det A, |R|) = 1). Group the plaintext into row vectors P_i 
of length n, and define f : R^n → R^n by f(P_i) = P_i A with each entry taken 
modulo |R|. The resulting rows listed together form the ciphertext. 


Example 6.3 In this example, we use the Hill encryption method to 
encipher the message “MEET AT SEVEN”. We begin by converting the message 
into a list of elements in Z_26. 


        M   E   E   T   A   T   S   E   V   E   N 
α  =>  12   4   4  19   0  19  18   4  21   4  13 

We will use the following 2 × 2 key matrix A to encipher this message. 

    A = [ 2  5 ] 
        [ 1  4 ] 

Note that A is invertible over Z_26 since (det A, 26) = (3, 26) = 1. To form 
the ciphertext, we group the plaintext into row vectors P_i of length 2, and 
compute P_i A for all i with each entry taken modulo 26. For example, the 
first ciphertext vector is computed as follows. 


    P_1 A = (12, 4) [ 2  5 ] 
                    [ 1  4 ] 
          = (28, 76) 
          = (2, 24) 

The remaining ciphertext vectors are computed as follows. (Since the 
message does not completely fill the last plaintext vector P_6, we fill this 
vector with an arbitrary element from Z_26.) 


    P_2 A = (4, 19)A  = (27, 96)  = (1, 18) 

    P_3 A = (0, 19)A  = (19, 76)  = (19, 24) 

    P_4 A = (18, 4)A  = (40, 106) = (14, 2) 

    P_5 A = (21, 4)A  = (46, 121) = (20, 17) 

    P_6 A = (13, 25)A = (51, 165) = (25, 9) 

Hence, the entire encipherment is 

M E E A S E V E N 
a => 12 4 4 19 0 19 18 4 21 4 #138 25 
f > 2 24 1 18 19 24 14 2 20 17 #2 9 
at> C Y BS T Y C U R Z J 


and the ciphertext is “CYBSTYOCURZJ”. Although the last ciphertext 
character is in a position beyond the last plaintext character, it must be 
retained as it is necessary for decipherment. E 
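The encipherment in Example 6.3 can also be sketched in a few lines of Python. This is only an illustration: the helper names (`to_numbers`, `to_letters`, `hill_encipher`) and the choice of Z as the default padding letter are our own assumptions, not part of the method itself.

```python
# Illustrative sketch of the Hill encryption method over Z_26.
# All function names here are hypothetical helpers, not a library API.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_numbers(text):
    """Apply alpha: map each letter to its ring element, skipping non-letters."""
    return [ALPHABET.index(ch) for ch in text.upper() if ch in ALPHABET]

def to_letters(numbers):
    """Apply alpha^{-1}: map ring elements back to letters."""
    return "".join(ALPHABET[x] for x in numbers)

def hill_encipher(plaintext, key, pad="Z", modulus=26):
    n = len(key)
    nums = to_numbers(plaintext)
    # Fill the last block with an arbitrary element (here the letter `pad`).
    while len(nums) % n != 0:
        nums.append(ALPHABET.index(pad))
    out = []
    for start in range(0, len(nums), n):
        block = nums[start:start + n]
        # Row vector times matrix: entry j is sum_i block[i] * key[i][j].
        out.extend(sum(block[i] * key[i][j] for i in range(n)) % modulus
                   for j in range(n))
    return to_letters(out)

A = [[2, 5], [1, 4]]
print(hill_encipher("MEET AT SEVEN", A))  # CYBSTYOCURZJ
```

The same function works for larger key matrices, since the block size is taken from the number of rows of the key.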


One thing we can notice immediately from Example 6.3 is that the Hill 
encryption method does not in general yield a substitution cipher. Also, 
enciphering messages with the Hill system requires nothing more than some 
matrix multiplication with an invertible key matrix A over R. A matrix A 
over R is invertible if and only if the determinant of A has a multiplicative 
inverse in R (see Written Exercise 9). For $R = \mathbb{Z}_k$, this is equivalent to 
$(\det A, k) = 1$ (see Written Exercise 10). 


To decipher a message that has been encrypted using the Hill system 
with an $n \times n$ key matrix $A$, we would group the ciphertext into row vectors 
$C_i$ of length $n$ and compute $f^{-1}(C_i) = C_i A^{-1} \bmod |R|$ for all $i$. The matrix 
$A^{-1}$ over $R$ can be determined in general by the well-known formula 

\[ A^{-1} = \frac{1}{\det A} \operatorname{adj} A \tag{6.1} \]

where adj A represents the adjoint of A. To determine adj A, we would first 
need to find the cofactors of A. These cofactors are defined as 

\[ C_{ij} = (-1)^{i+j} M_{ij}, \qquad i, j = 1, \ldots, n \]

where $M_{ij}$ is the determinant of the matrix obtained by deleting the $i$th row 
and $j$th column from A. Using these cofactors, the adjoint of A is defined 
as follows. 

\[ \operatorname{adj} A = \begin{bmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{bmatrix} \]

That is, adj A is the transpose of the matrix of cofactors of A. 


Example 6.4 To decipher the message in Example 6.3, we would first need 
to find the inverse of the key matrix $A$. Using (6.1), we can determine this 
inverse as follows. 

\[ A^{-1} = \frac{1}{3} \begin{bmatrix} 4 & -5 \\ -1 & 2 \end{bmatrix} = 9 \begin{bmatrix} 4 & -5 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 10 & 7 \\ 17 & 18 \end{bmatrix} \]

Then to recover the plaintext, we group the ciphertext into row vectors $C_i$ 
of length 2 and compute $C_i A^{-1}$ for all $i$ with each entry taken modulo 26. 
For example, the first plaintext vector is recovered as follows. 

\[ C_1 A^{-1} = (2, 24) \begin{bmatrix} 10 & 7 \\ 17 & 18 \end{bmatrix} = (428, 446) = (12, 4) \]

The remaining plaintext vectors are recovered as follows. 

\begin{align*}
C_2 A^{-1} &= (1, 18) A^{-1} = \cdots = (4, 19) \\
C_3 A^{-1} &= (19, 24) A^{-1} = \cdots = (0, 19) \\
C_4 A^{-1} &= (14, 2) A^{-1} = \cdots = (18, 4) \\
C_5 A^{-1} &= (20, 17) A^{-1} = \cdots = (21, 4) \\
C_6 A^{-1} &= (25, 9) A^{-1} = \cdots = (13, 25)
\end{align*}

Applying $\alpha^{-1}$ to the entries in these plaintext vectors will reveal the original 
message. (Because we chose an arbitrary element from $\mathbb{Z}_{26}$ to fill the last 
plaintext vector $P_6$ in Example 6.3, there will be an extra character at the 
end of this message.) ■
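Formula (6.1) translates directly into code. The following Python sketch computes the inverse of a key matrix modulo 26 from its cofactors and then deciphers the ciphertext of Example 6.3. The function names are our own, the cofactor expansion is practical only for small matrices, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
from math import gcd

def minor(M, i, j):
    """Matrix obtained from M by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def inverse_mod(M, modulus=26):
    """A^{-1} = (det A)^{-1} adj A, with every entry reduced modulo `modulus`."""
    n = len(M)
    d = det(M) % modulus
    if gcd(d, modulus) != 1:
        raise ValueError("matrix is not invertible modulo %d" % modulus)
    d_inv = pow(d, -1, modulus)
    # Entry (i, j) of adj M is the cofactor C_{ji} (transpose of cofactors).
    return [[(d_inv * (-1) ** (i + j) * det(minor(M, j, i))) % modulus
             for j in range(n)] for i in range(n)]

def hill_decipher(cipher_numbers, key, modulus=26):
    n = len(key)
    inv = inverse_mod(key, modulus)
    out = []
    for start in range(0, len(cipher_numbers), n):
        block = cipher_numbers[start:start + n]
        out.extend(sum(block[i] * inv[i][j] for i in range(n)) % modulus
                   for j in range(n))
    return out

A = [[2, 5], [1, 4]]
print(inverse_mod(A))  # [[10, 7], [17, 18]]
print(hill_decipher([2, 24, 1, 18, 19, 24, 14, 2, 20, 17, 25, 9], A))
```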


We now consider how an intruder could break the cryptosystem in 
Example 6.3. We would assume that an intruder who intercepts the 
ciphertext in Example 6.3 would know that the ciphertext was formed 
by grouping the plaintext into row vectors $P_i$ of length 2 and multiplying 
each $P_i$ by some 2 x 2 invertible key matrix A over $\mathbb{Z}_{26}$. Although the 
requirement that A be invertible does not impose any specific restrictions 
on any of the four individual entries in A, an intruder would at least know 
that each of these entries must be an element of $\mathbb{Z}_{26}$. Hence, to find A by trial 
and error, an intruder would have to test a maximum of only $26^4 = 456976$ 
possible key matrices. While it would not be feasible for an intruder to 
test all of these possible key matrices by hand or even with a calculator, 
an intruder using a computer that can perform millions of operations per 
second could test these possible key matrices very quickly and easily. Hence, 
the Hill system with a 2 x 2 key matrix A does not yield a very secure 
system. However, if A were chosen of size 3 x 3, then there would be 
$26^9 = 5429503678976$ possible key matrices for an intruder to test; and if 
A were chosen of size 5 x 5 there would be $26^{25} \approx 2.37 \times 10^{35}$ possible key 
matrices. Thus, even with a relatively small key matrix, the Hill system 
yields a reasonable amount of security. And the Hill system can be used to 
obtain any desired level of security by simply choosing a key matrix that is 
sufficiently large. 
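The counts quoted above are upper bounds, since not every matrix over $\mathbb{Z}_{26}$ is invertible. As a rough illustration, the following Python sketch (with a function name of our own choosing) counts the invertible 2 x 2 matrices by brute force and prints the bounds used in the discussion.

```python
from itertools import product
from math import gcd

def count_invertible_2x2(modulus=26):
    """Count 2 x 2 matrices over Z_modulus whose determinant is a unit."""
    return sum(1 for a, b, c, d in product(range(modulus), repeat=4)
               if gcd((a * d - b * c) % modulus, modulus) == 1)

print(26 ** 4)                   # upper bound for 2 x 2 keys
print(count_invertible_2x2())    # number of valid 2 x 2 keys
print(26 ** 9)                   # upper bound for 3 x 3 keys
print(float(26 ** 25))           # upper bound for 5 x 5 keys
```

Only about a third of the 2 x 2 matrices turn out to be valid keys, which makes a trial-and-error search even shorter than the bound suggests.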


The Hill system does have a vulnerability we should mention. It is 
not unreasonable to suppose that an intruder who intercepts a ciphertext 
formed using the Hill system might know or be able to guess a small part 
of the plaintext. For example, the intruder may know from whom the 
message originated and correctly guess that the last few characters in the 
plaintext were the originator’s name or signature. It turns out that it may 
be possible for an intruder to break the Hill system relatively easily if the 
intruder knows or is able to correctly guess a small part of the plaintext. 
More specifically, if the Hill system is used with an n x n key matrix, it 
may be possible for an intruder to break the system relatively easily if the 
intruder knows or is able to correctly guess $n^2$ characters from the plaintext. 
We illustrate this in the following example. 


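Example 6.5 Suppose an intruder intercepts the ciphertext in Example 
6.3 and somehow knows or guesses that the last four ciphertext letters were 
produced by the plaintext letters “VENZ”. That is, suppose the intruder 
knows or guesses the following from the encipherment in Example 6.3. 

                                             V   E   N   Z
α ⇒                                         21   4  13  25
f ⇒          2  24   1  18  19  24  14   2  20  17  25   9
α^{-1} ⇒     C   Y   B   S   T   Y   O   C   U   R   Z   J

Since the intruder would know that the plaintext was enciphered using some 
2 x 2 key matrix 

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

over $\mathbb{Z}_{26}$, then the intruder would know that 

\[ (21, 4) \begin{bmatrix} a & b \\ c & d \end{bmatrix} = (20, 17) \quad \text{and} \quad (13, 25) \begin{bmatrix} a & b \\ c & d \end{bmatrix} = (25, 9) \]

with $a, b, c, d \in \mathbb{Z}_{26}$. The preceding two matrix equations are equivalent to 
the following single matrix equation. 

\[ \begin{bmatrix} 21 & 4 \\ 13 & 25 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 20 & 17 \\ 25 & 9 \end{bmatrix} \tag{6.2} \]

If the intruder could determine a unique solution to this equation over $\mathbb{Z}_{26}$, 
then this solution would necessarily be the key matrix A for the system. 
Note that since 

\[ \begin{vmatrix} 21 & 4 \\ 13 & 25 \end{vmatrix} = 473 = 5 \]

has an inverse in $\mathbb{Z}_{26}$, then (6.2) has a unique solution over $\mathbb{Z}_{26}$. Using 
(6.1), the intruder could find this solution as follows. 

\begin{align*}
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
&= \begin{bmatrix} 21 & 4 \\ 13 & 25 \end{bmatrix}^{-1} \begin{bmatrix} 20 & 17 \\ 25 & 9 \end{bmatrix} \\
&= \frac{1}{5} \begin{bmatrix} 25 & -4 \\ -13 & 21 \end{bmatrix} \begin{bmatrix} 20 & 17 \\ 25 & 9 \end{bmatrix} \\
&= 21 \begin{bmatrix} 25 & 22 \\ 13 & 21 \end{bmatrix} \begin{bmatrix} 20 & 17 \\ 25 & 9 \end{bmatrix} \\
&= \begin{bmatrix} 22050 & 13083 \\ 16485 & 8610 \end{bmatrix} \\
&= \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix}
\end{align*}

Note that $5^{-1} = 21 \bmod 26$ because $5 \cdot 21 = 105 = 1 \bmod 26$. Note also 
that the last result is the key matrix A from Example 6.3. The intruder 
could then find $A^{-1}$ and decipher the rest of the ciphertext. ■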
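The attack in Example 6.5 amounts to one matrix inversion and one matrix product modulo 26. The Python sketch below illustrates this for 2 x 2 keys only. The names `inv2x2`, `matmul`, and `recover_key` are hypothetical helpers, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
from math import gcd

def matmul(X, Y, modulus=26):
    """Matrix product with every entry reduced modulo `modulus`."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % modulus
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2x2(M, modulus=26):
    """Inverse of a 2 x 2 matrix over Z_modulus via formula (6.1)."""
    (p, q), (r, s) = M
    d = (p * s - q * r) % modulus
    if gcd(d, modulus) != 1:
        raise ValueError("determinant has no inverse modulo %d" % modulus)
    d_inv = pow(d, -1, modulus)
    return [[(d_inv * s) % modulus, (-d_inv * q) % modulus],
            [(-d_inv * r) % modulus, (d_inv * p) % modulus]]

def recover_key(plain_rows, cipher_rows, modulus=26):
    """Solve (plain_rows) X = (cipher_rows) for the key matrix X."""
    return matmul(inv2x2(plain_rows, modulus), cipher_rows, modulus)

# Known plaintext "VENZ" and the matching ciphertext "URZJ" from Example 6.3.
print(recover_key([[21, 4], [13, 25]], [[20, 17], [25, 9]]))  # [[2, 5], [1, 4]]
```

If the matrix of known plaintext is not invertible, `inv2x2` raises an error and a different stretch of known plaintext would be needed for this approach.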
For a message enciphered using the Hill system with an $n \times n$ key 
matrix, one obvious problem an intruder could encounter when trying to 
break the system in the way illustrated in Example 6.5 is that even if 
the intruder knows or correctly guesses $n^2$ characters from the plaintext, 
the analogue to (6.2) may not have a unique solution. And even if the 
analogue to (6.2) has a unique solution, it may not be possible to find this 
solution in the way illustrated in Example 6.5. For example, if an intruder 
intercepts the ciphertext in Example 6.3 and correctly guesses that the first 
four ciphertext letters were produced by the plaintext letters “MEET”, the 
analogue to (6.2) would be the following. 

\[ \begin{bmatrix} 12 & 4 \\ 4 & 19 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 2 & 24 \\ 1 & 18 \end{bmatrix} \tag{6.3} \]

But since 

\[ \begin{vmatrix} 12 & 4 \\ 4 & 19 \end{vmatrix} = 212 = 4 \]

does not have a multiplicative inverse in $\mathbb{Z}_{26}$, then even if (6.3) has a unique 
solution over $\mathbb{Z}_{26}$ it will not be possible to find this solution in the way 
illustrated in Example 6.5. 
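When the matrix of known plaintext is not invertible, an intruder can still search for solutions directly. The following Python sketch solves (6.3) by brute force, one column of the unknown matrix at a time. The function name is our own, and the approach is shown only for 2 x 2 matrices.

```python
from itertools import product

def solve_by_columns(M, rhs, modulus=26):
    """Return every 2 x 2 matrix X over Z_modulus satisfying M X = rhs."""
    columns = []
    for j in range(2):
        # Column j of X is a pair (x, z) with M (x, z)^T = column j of rhs.
        columns.append([
            (x, z) for x, z in product(range(modulus), repeat=2)
            if (M[0][0] * x + M[0][1] * z) % modulus == rhs[0][j]
            and (M[1][0] * x + M[1][1] * z) % modulus == rhs[1][j]
        ])
    return [[[c0[0], c1[0]], [c0[1], c1[1]]]
            for c0 in columns[0] for c1 in columns[1]]

solutions = solve_by_columns([[12, 4], [4, 19]], [[2, 24], [1, 18]])
for X in solutions:
    print(X)
```

For (6.3) this search finds four solutions, one of which is the key matrix from Example 6.3, so the known plaintext “MEET” narrows the key down to a handful of candidates without determining it uniquely.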


6.3 The Hill Cryptosystem with Maple 


In this section we show how Maple can be used to encipher and decipher 
messages using the Hill cryptosystem. 


We begin by establishing the correspondence a between the alphabet 
letters and ring elements. To do this, we construct the following array 
letters containing the alphabet letters. 


> letters := array(0..25, [A, B, C, D, E, F, G, H, I, J, K, L, 
> M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z]): 

We can then access the alphabet letters from their positions in this array, 
with the first letter being in position 0. For example, because C is the 
third letter in this array, the letter C is returned when its corresponding 
ring element is entered as follows. 


> letters[2]; 


                                   C


So that we can also access the ring elements by entering the alphabet letters, 
we will also establish the correspondence a in a table. We first create the 
table ltable. 

> ltable := table(): 


We can then use the array letters to establish the correspondence a in 
ltable by entering the following commands. 

> for i from 0 to 25 do 

> ltable[ letters[i] ] := i: 


> od: 


Using ltable, we can access the ring elements by entering the correspond- 
ing letters. For example, the ring element that corresponds to the letter C 
is returned by the following command. 

> ltable[C]; 


2 


We now show how Maple can be used to encipher the message 
“RENDEZVOUS AT NOON”. We will use the following 3 x 3 key matrix A. 

> with(linalg): 

> A := matrix( [ [11,6,8], [0,3,14], [24,0,9] ] ); 


By entering the following commands, we verify that A is a valid key matrix 
by verifying that the determinant of A is relatively prime to the number of 
alphabet letters. 

> d := det(A) mod 26; 


> gcd(d, 26); 


Next, we enter the plaintext as the vector ptext. Note that we include two 
extra letters in this vector so that the number of plaintext characters will 
be a multiple of the number of rows of A. 
> ptext := vector([R, E, N, D, E, Z, V, O, U, S, A, T, N, O, O, 
> N, A, A]); 


ptext := [R, E, N, D, E, Z, V, O, U, S, A, T, N, O, O, N, A, A] 


The following command returns the number of plaintext characters. 


> vectdim(ptext); 


18 


By entering the following loop, we convert the list ptext of plaintext letters 
into a list of ring elements. We then use evalm to display the result. 


> for i from 1 to vectdim(ptext) do 
> ptext[i] := ltable[ ptext[i] ]: 
> od: 


> evalm(ptext) ; 
[17, 4, 13, 3, 4, 25, 21, 14, 20, 18, 0, 19, 13, 14, 14, 13, 0, 0] 


Before enciphering this message, we must group the plaintext ring elements 
into row vectors with the same number of positions as the number of rows 
of A. To do this, we first assign the number of rows of A as blocksize. 


> blocksize := rowdim(A); 
blocksize := 3 
Next, we assign the number of row vectors into which we will group the 
plaintext ring elements as numblocks. 


> numblocks := vectdim(ptext)/blocksize; 
numblocks := 6 
By entering the following matrix command, we group the plaintext ring 
elements from the vector ptext into row vectors of length blocksize and 
place these row vectors in order as rows in the matrix pmatrix. 


> pmatrix := matrix(numblocks, blocksize, ptext); 

                         [17    4   13]
                         [ 3    4   25]
              pmatrix := [21   14   20]
                         [18    0   19]
                         [13   14   14]
                         [13    0    0]


Because pmatrix contains all of the plaintext row vectors, we can find all of 
the ciphertext row vectors at once by multiplying pmatrix by A. By entering 
the following command, we compute this product and define the result as 
cmatrix. Note that in this command we use the Maple map procedure to 
reduce the entries in cmatrix modulo 26. Note also that we use &* for the 
matrix multiplication, and evalm to display the result. 


> cmatrix := map(m -> m mod 26, evalm(pmatrix &* A)); 

                         [ 5   10   23]
                         [ 9    4   19]
              cmatrix := [ 9   12   24]
                         [ 4    4    3]
                         [11   16   10]
                         [13    0    0]


We can use the Maple convert command as follows to list the rows in 
cmatrix in order as the vector ctext. 


> ctext := convert(cmatrix, vector); 
ctext := [5, 10, 23, 9, 4, 19, 9, 12, 24, 4, 4, 3, 11, 16, 10, 13, 0, 0] 


The preceding output shows the ciphertext expressed as a list of ring ele- 
ments. By entering the following loop, we convert this list of ring elements 
into a list of alphabet letters. 


> for i from 1 to vectdim(ctext) do 
> ctext[i] := letters[ ctext[i] ]: 
> od: 


> evalm(ctext); 
[F, K, X, J, E, T, J, M, Y, E, E, D, L, Q, K, N, A, A] 

Thus, the resulting ciphertext is “FKXJETJMYEEDLQKNAA”. 


To decipher this message, we would begin by defining the ciphertext 
as the vector ctext, and defining letters, ltable, A, blocksize, and 
numblocks as before (defining numblocks as the number of ciphertext char- 
acters divided by blocksize). We would then need to convert the list of 
ciphertext letters into a list of ring elements. We can do this by entering 
the following commands. 


> for i from 1 to vectdim(ctext) do 
> ctext[i] := ltable[ ctext[i] ]: 
> od: 


> evalm(ctext) ; 


[5, 10, 23, 9, 4, 19, 9, 12, 24, 4, 4, 3, 11, 16, 10, 13, 0, 0] 
Next, we would need to group these ciphertext ring elements into row 
vectors of length blocksize. We can do this by entering the following 
command, which leaves these row vectors in order as rows in the matrix 
cmatrix. 


> cmatrix := matrix(numblocks, blocksize, ctext); 

                         [ 5   10   23]
                         [ 9    4   19]
              cmatrix := [ 9   12   24]
                         [ 4    4    3]
                         [11   16   10]
                         [13    0    0]


We can then recover the plaintext ring elements by multiplying the preced- 
ing matrix by the inverse of the key matrix A. We can do this by entering 
the following command, which leaves the resulting product as the matrix 
pmatrix. (Note that to obtain the inverse of A, we need only raise A to the 
power -1.) 


> pmatrix := map(m -> m mod 26, evalm(cmatrix &* A^(-1))); 

                         [17    4   13]
                         [ 3    4   25]
              pmatrix := [21   14   20]
                         [18    0   19]
                         [13   14   14]
                         [13    0    0]


We can list the plaintext ring elements in order in the vector ptext by 
entering the following command. 


> ptext := convert(pmatrix, vector); 
ptext := [17, 4, 13, 3, 4, 25, 21, 14, 20, 18, 0, 19, 13, 14, 14, 13, 0, 0] 


Finally, we can see the corresponding alphabet letters by entering the fol- 
lowing commands. 
> for i from 1 to vectdim(ptext) do 
> ptext[i] := letters[ ptext[i] ]: 
> od: 
> evalm(ptext); 


[R, E, N, D, E, Z, V, O, U, S, A, T, N, O, O, N, A, A] 


6.4 Generalizations of the Hill Cryptosystem 


Just as we enhanced the security of our first encryption method in Section 
6.1 by including a constant in the function $f : R \to R$, we can enhance the 
security of the Hill encryption method by including a vector of constants 
in the function $f : R^n \to R^n$. We summarize this as our next encryption 
method. 


Generalized Hill Encryption Method: Let $A$ be an $n \times n$ invertible 
matrix over $R$, and let $B$ be a row vector of length $n$ over $R$. Group the 
plaintext into row vectors $P_i$ of length $n$, and define $f : R^n \to R^n$ by 
$f(P_i) = P_i A + B$ with each entry taken modulo $|R|$. The resulting rows 
listed together form the ciphertext. 


The generalized Hill encryption method yields a system that has the 
matrix A and vector B as keys. The inclusion of B obviously yields a 
higher level of security than our original Hill encryption method. To break 
the generalized system by trial and error, an intruder would have to test 
a maximum of $26^{n^2+n}$ possible pairs (A, B) of keys, while to break the 
original Hill system an intruder would have to test a maximum of only $26^{n^2}$ 
possible keys. Even for very small values of n this increase is significant. 
For example, for n = 5, the generalized system has $26^5 = 11881376$ times as 
many possible pairs of keys as the number of possible keys for our original 
Hill system. 


To decipher a message that has been encrypted using the generalized 
Hill encryption method with a key matrix A of size n x n and vector B 
of length n, we would group the ciphertext into row vectors $C_i$ of length 
n, and compute $f^{-1}(C_i) = (C_i - B)A^{-1} \bmod |R|$ for all $i$. Hence, despite 
the somewhat significant increase in security yielded by the generalized Hill 
encryption method, there is not a significant increase in the computational 
work necessary for enciphering and deciphering messages. 
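To see how little extra work the vector B requires, consider the following Python sketch of the generalized method acting on blocks of ring elements. The function names are our own, the key matrix and its inverse are taken from Examples 6.3 and 6.4, and the vector B = (1, 2) is simply an illustrative choice.

```python
def generalized_hill(blocks, A, B, modulus=26):
    """Compute P_i A + B (mod modulus) for each plaintext block P_i."""
    n = len(A)
    return [[(sum(P[i] * A[i][j] for i in range(n)) + B[j]) % modulus
             for j in range(n)] for P in blocks]

def generalized_hill_inverse(blocks, A_inv, B, modulus=26):
    """Compute (C_i - B) A^{-1} (mod modulus) for each ciphertext block C_i."""
    n = len(A_inv)
    return [[sum((C[i] - B[i]) * A_inv[i][j] for i in range(n)) % modulus
             for j in range(n)] for C in blocks]

A = [[2, 5], [1, 4]]          # key matrix from Example 6.3
A_inv = [[10, 7], [17, 18]]   # its inverse from Example 6.4
B = [1, 2]                    # illustrative vector of constants

plain = [[12, 4], [4, 19]]    # the blocks for "MEET"
cipher = generalized_hill(plain, A, B)
print(cipher)
print(generalized_hill_inverse(cipher, A_inv, B))
```

Compared with the original Hill method, enciphering adds one vector addition per block and deciphering adds one vector subtraction per block.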


We now discuss an extension of the generalized Hill encryption method 
that yields an even higher level of security. 


Variable Matrix Encryption Method: Let $A$ be an $n \times n$ invertible 
matrix over $R$, and let $B_i$ be varying row vectors of length $n$ over $R$. Group 
the plaintext into row vectors $P_i$ of length $n$, and define $f_i : R^n \to R^n$ by 
$f_i(P_i) = P_i A + B_i$ with each entry taken modulo $|R|$. The resulting rows 
listed together form the ciphertext. 


To decipher a message that has been encrypted using the variable 
matrix encryption method with a key matrix A of size n x n and vectors 
$B_i$ of length n, we would group the ciphertext into row vectors $C_i$ of length 
n, and compute $f_i^{-1}(C_i) = (C_i - B_i)A^{-1} \bmod |R|$ for all $i$. 


One problem with the variable matrix encryption method is that in 
practice it may be difficult or cumbersome for the originator and intended 
recipient of the messages to keep a record of the vectors $B_i$. To avoid this 
problem, these vectors can be chosen so that they depend uniquely on the 
plaintext vectors $P_i$, the ciphertext vectors $C_i$, or the previous $B_i$. For 
example, three simple methods for choosing the $B_i$ are 


1. $B_i = P_{i-1} B$, where $B$ is a fixed n x n matrix over R and $P_0$ is given, 

2. $B_i = C_{i-1} B$, where $B$ is a fixed n x n matrix over R and $C_0$ is given, 

3. $B_i = (r_i, r_{i+1}, \ldots, r_{i+n-1})$, where $\{r_j\}$ is a recursive sequence over R 
with the necessary initial $r_j$ given. 


Example 6.6 In this example we use the variable matrix encryption 
method to encipher the message “MEET AT SEVEN”. We begin by converting 
the message into a list of elements in $\mathbb{Z}_{26}$. The result of this conversion 
is shown at the beginning of Example 6.3. We will use the following key 
matrix A to encipher this message. 

\[ A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \]

And we will use the first of the three methods listed above for choosing the 
vectors $B_i$ with the following matrix B 

\[ B = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \]

and vector $P_0 = (1, 2, 3)$. To form the ciphertext, we group the plaintext 
into row vectors $P_i$ of length 3, and compute $P_i A + B_i$ for all $i$ with each 
entry taken modulo 26. Before constructing the first ciphertext vector, we 
construct the vector $B_1$ as follows. 

\[ B_1 = P_0 B = (1, 2, 3) B = (3, 2, 5) \]

We can then construct the first ciphertext vector as follows. 

\begin{align*}
P_1 A + B_1 &= (12, 4, 4) A + (3, 2, 5) \\
&= (27, 38, 21) \\
&= (1, 12, 21)
\end{align*}

We construct the remaining vectors $B_i$ as follows. 

\begin{align*}
B_2 &= P_1 B = (12, 4, 4) B = (16, 4, 8) \\
B_3 &= P_2 B = (19, 0, 19) B = (19, 0, 19) \\
B_4 &= P_3 B = (18, 4, 21) B = (22, 4, 25)
\end{align*}

And we can then construct the remaining ciphertext vectors as follows. 

\begin{align*}
P_2 A + B_2 &= (19, 0, 19) A + (16, 4, 8) = \cdots = (9, 2, 20) \\
P_3 A + B_3 &= (18, 4, 21) A + (19, 0, 19) = \cdots = (23, 4, 6) \\
P_4 A + B_4 &= (4, 13, 25) A + (22, 4, 25) = \cdots = (13, 23, 2)
\end{align*}

Hence, the entire encipherment is 

             M   E   E   T   A   T   S   E   V   E   N
α ⇒         12   4   4  19   0  19  18   4  21   4  13  25
f ⇒          1  12  21   9   2  20  23   4   6  13  23   2
α^{-1} ⇒     B   M   V   J   C   U   X   E   G   N   X   C

and the ciphertext is “BMVJCUXEGNXC”. ■
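The bookkeeping in Example 6.6 is easy to automate. The Python sketch below implements the first method for choosing the vectors $B_i$ (each $B_i$ is the previous plaintext block times the matrix B) and checks it against the numbers in Example 6.6. The function names are our own. The key matrix is entered with rows (1, 2, 1), (3, 1, 0), (0, 2, 1), which is the matrix consistent with the ciphertext vectors computed in the example, and its inverse modulo 26 is entered directly as an assumption that the last lines verify.

```python
def vecmat(v, M, modulus=26):
    """Row vector times matrix, reduced modulo `modulus`."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) % modulus
            for j in range(len(M[0]))]

def variable_matrix_encipher(blocks, A, B, P0, modulus=26):
    out, prev = [], P0
    for P in blocks:
        Bi = vecmat(prev, B, modulus)            # B_i = P_{i-1} B
        PA = vecmat(P, A, modulus)
        out.append([(x + y) % modulus for x, y in zip(PA, Bi)])
        prev = P
    return out

def variable_matrix_decipher(cipher_blocks, A_inv, B, P0, modulus=26):
    out, prev = [], P0
    for C in cipher_blocks:
        Bi = vecmat(prev, B, modulus)            # rebuilt from recovered P_{i-1}
        P = vecmat([(c - b) % modulus for c, b in zip(C, Bi)], A_inv, modulus)
        out.append(P)
        prev = P
    return out

A = [[1, 2, 1], [3, 1, 0], [0, 2, 1]]
A_inv = [[1, 0, 25], [23, 1, 3], [6, 24, 21]]    # assumed inverse of A mod 26
B = [[1, 0, 0], [1, 1, 1], [0, 0, 1]]
P0 = [1, 2, 3]
plain = [[12, 4, 4], [19, 0, 19], [18, 4, 21], [4, 13, 25]]

cipher = variable_matrix_encipher(plain, A, B, P0)
print(cipher)
print(variable_matrix_decipher(cipher, A_inv, B, P0) == plain)
```

Because each $B_i$ depends on the previous plaintext block, the recipient must decipher the blocks in order.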


6.5 The Two-Message Problem 


Recall that the Hill cryptosystem can be used to obtain any desired level of 
security by simply choosing a key matrix that is sufficiently large. However, 
recall also that for the Hill system with an n x n key matrix, we discussed 
a technique in Section 6.2 by which an intruder may be able to break the 
system relatively easily if the intruder knows or is able to correctly guess $n^2$ 
characters from the plaintext. This technique illustrates the general fact 
that it is sometimes possible for an intruder to break a cryptosystem in 
an unusual way provided the intruder knows some additional information 
about the system (such as, for example, $n^2$ characters from the plaintext). 
In this section we discuss a technique by which an intruder can break a 
slight modification of the Hill system in an unusual way. We first discuss 
the modification of the system, which consists of using a key matrix that is 
involutory. A matrix $K$ is said to be involutory if $K^2 = I$ (i.e., if $K = K^{-1}$). 


Modified Hill Encryption Method: Let $K$ be an $n \times n$ involutory 
matrix over $R$. Group the plaintext into row vectors $P_i$ of length $n$, and 
form ciphertext vectors $C_i$ by $C_i = P_i K$ with each entry taken modulo $|R|$. 


This modified Hill encryption method is in fact the method suggested 
by Hill when he first presented his cryptosystem in 1929. The reason for 
Hill’s suggestion of using an involutory key matrix is obvious, for then the 
same matrix could be used to decipher a message that was used to enci- 
pher the message. Although this simplification of the Hill system is not as 
significant now with the recent developments in calculators and computers, 
the elimination of having to determine the inverse of the key matrix was 
very noteworthy in 1929. In fact, Hill even invented a machine designed 
to perform the calculations in his cryptosystem, and argued that by using 
an involutory key matrix one could both encipher and decipher a single 
message without “changing the settings”. 


Note that in order for the modified Hill encryption method to be a 
useful encryption technique, there should be a relatively large number of 
n x n involutory matrices for each n > 1. For any specific n > 1, if there 
are only a relatively few n x n involutory matrices, then an involutory key 
matrix of size n x n would certainly not yield a secure cryptosystem. We 
would assume in general that an intruder to such a system would know 
the size of the key matrix and the fact that the key matrix was involutory. 
Hence, if there are only a relatively few involutory matrices of size 
nxn, an intruder could break the system very easily by simply testing each 
one. However, it is not the case that there are only a relatively few n x n 
involutory matrices for any n > 1. It has been well-known for many years 
that for any matrices A of size r x s and B of size s x r over a ring R, the 
block matrix 


\[ \begin{bmatrix} BA - I & B \\ 2A - ABA & I - AB \end{bmatrix} \]


is involutory over R (see Written Exercise 7). Thus, it is not unreason- 
able to suppose that an intruder should find the modified Hill system not 
significantly less difficult to break than the usual Hill system. But, as men- 
tioned, it is sometimes possible for an intruder to break a cryptosystem in 
an unusual way provided the intruder knows some additional information 
about the system. As we discuss next, an intruder can break the modified 
Hill system in an unusual and relatively easy way provided the intruder 
intercepts two ciphertexts formed from the same plaintext using different 
involutory key matrices of the same size. In this scenario, the problem of 
breaking the system is called the Two-Message problem. 
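The block matrix construction mentioned above is simple to try out. The following Python sketch builds the matrix from an r x s matrix A and an s x r matrix B over $\mathbb{Z}_{26}$ and checks that its square is the identity. The function names and the sample matrices are our own illustrative choices.

```python
def matmul(X, Y, modulus=26):
    """Matrix product with every entry reduced modulo `modulus`."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % modulus
             for j in range(len(Y[0]))] for i in range(len(X))]

def involutory_from(A, B, modulus=26):
    """Build [[BA - I, B], [2A - ABA, I - AB]] from A (r x s) and B (s x r)."""
    r, s = len(A), len(A[0])
    BA = matmul(B, A, modulus)      # s x s
    AB = matmul(A, B, modulus)      # r x r
    ABA = matmul(AB, A, modulus)    # r x s
    top = [[(BA[i][j] - int(i == j)) % modulus for j in range(s)]
           + [B[i][j] % modulus for j in range(r)] for i in range(s)]
    bottom = [[(2 * A[i][j] - ABA[i][j]) % modulus for j in range(s)]
              + [(int(i == j) - AB[i][j]) % modulus for j in range(r)]
              for i in range(r)]
    return top + bottom

K = involutory_from([[1, 2]], [[3], [4]])   # r = 1, s = 2 gives a 3 x 3 matrix
print(K)
print(matmul(K, K))                          # the 3 x 3 identity
```

Every choice of A and B gives an involutory matrix of size r + s, which is why such matrices are plentiful.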


Suppose we intercept ciphertexts $C$, $C'$ formed from the same plaintext 
$P$ using the Hill cryptosystem with two different $n \times n$ involutory key 
matrices $K$, $K'$. That is, suppose a single plaintext $P$ is grouped into row vectors 
$P_i$ of length $n$, and we intercept ciphertext vectors $C_i$ and $C_i'$ formed by 

\[ C_i = P_i K \tag{6.4} \]

and 

\[ C_i' = P_i K' \tag{6.5} \]

for all $i$ where $K$, $K'$ are distinct $n \times n$ involutory matrices. The Two-Message 
problem is to determine the plaintext vectors $P_i$ for all $i$ from 
knowledge of the ciphertext vectors $C_i$ and $C_i'$. Note that since $K$ and 
$K'$ are involutory, they are their own corresponding decryption matrices. 
Thus, if we could determine $K$ or $K'$ we would be done. To do this, note 
that because $K$ is involutory, (6.4) is equivalent to 

\[ P_i = C_i K \tag{6.6} \]

for all $i$. By substituting this expression for $P_i$ into (6.5), we obtain 

\[ C_i' = C_i K K' \tag{6.7} \]

for all $i$. If there are $n$ values of $i$, say $i_1, i_2, \ldots, i_n$, for which the $n \times n$ matrix 
$S = [C_{i_1}, C_{i_2}, \ldots, C_{i_n}]$ (with these vectors as its rows) is invertible, then we 
can determine the matrix $KK'$ as follows. Define the $n \times n$ matrix 
$T = [C_{i_1}', C_{i_2}', \ldots, C_{i_n}']$. Since (6.7) holds 
for every $i$, it follows that $T = SKK'$. Hence, we can determine $KK'$ as 
$KK' = S^{-1}T$. Note that in order for the matrices $S$ and $T$ to exist, the 
message $P$ must be at least $n^2$ characters in length. Note also that $T$ will 
be invertible since $T^{-1}$ can be expressed as $T^{-1} = K'KS^{-1}$. 


Recall, as we stated above, if we can determine K or K’, then the 
Two-Message problem will be solved. However, as we have just shown, 
with a very mild assumption we should not have any difficulty in finding 
the matrix KK’. After finding KK’, we consider the matrix equation 
$(KK')X = X(KK')^{-1}$ or, equivalently, 

\[ (KK')X = X(K'K) \tag{6.8} \]


for unknown matrix X. Note that both K and K’ are involutory solutions 
to (6.8). Thus, if we find all of the involutory solutions to (6.8), the resulting 
collection of matrices will include both K and K’. To find the plaintext, we 
must then only decipher one of the ciphertexts with each of the involutory 
solutions to (6.8). Most likely only one of these decipherments, the correct 
plaintext, will make any sense. (To save time, we can decipher only a 
portion of one of the ciphertexts with each of the involutory solutions to 
(6.8). This will reveal which of the involutory solutions to (6.8) is the 
correct key matrix for that ciphertext. We can then use this key matrix to 
decipher the rest of the ciphertext.) 
We now summarize the complete solution process for the Two-Message 
problem as follows. 


1. Determine the matrix $KK'$. 
2. Find all of the involutory solutions to (6.8). 


3. Decipher one of the ciphertexts with each of the involutory solutions 
to (6.8). The correct key matrix, which is one of these involutory 
solutions, will yield the correct plaintext. 


The procedure for completing the first step in this solution process was 
described previously, and can generally be done in a straightforward man- 
ner. The calculations for the third step can also generally be done in a 
straightforward manner, although due to the potentially large number of 
involutory solutions to (6.8), these calculations can be long and tedious. It 
is in the second step of this solution process that the essential difficulties 
of the Two-Message problem lie. This step can be considered in two parts. 
First, we can determine the general solution X to (6.8) by solving a system 
of $n^2$ linear equations for the unknown elements in X. After finding this 
general solution, we can find the involutory solutions to (6.8) by imposing 
the condition $X^2 = I$ on the general solution X. This requires solving 
a system of up to $n^2$ quadratic equations, hence providing many possible 
difficulties, especially for large n. However, the potential difficulties that 
can be incurred in solving a system of up to $n^2$ quadratic equations do not 
compare (time-wise) to those incurred in breaking the modified Hill system 
by using trial and error to determine the key matrix. 
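For 2 x 2 key matrices the whole solution process is small enough to carry out by exhaustive search. The Python sketch below follows the three steps, but replaces the algebraic solution of (6.8) in the second step with a brute-force scan over all 2 x 2 matrices, which is a simplification of ours that would not scale to large n. The function names and the two sample involutory keys are hypothetical, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
from itertools import product
from math import gcd

M = 26

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % M
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2x2(S):
    """Inverse of a 2 x 2 matrix over Z_26, or None if it is not invertible."""
    (p, q), (r, s) = S
    d = (p * s - q * r) % M
    if gcd(d, M) != 1:
        return None
    e = pow(d, -1, M)
    return [[(e * s) % M, (-e * q) % M], [(-e * r) % M, (e * p) % M]]

def two_message_candidates(C, C_prime):
    # Step 1: pick rows giving an invertible S, then KK' = S^{-1} T.
    for i, j in product(range(len(C)), repeat=2):
        S_inv = inv2x2([C[i], C[j]])
        if S_inv is not None:
            KK = matmul(S_inv, [C_prime[i], C_prime[j]])
            break
    else:
        raise ValueError("no invertible S could be formed")
    KpK = inv2x2(KK)  # (KK')^{-1} = K'K
    # Step 2: keep every involutory X with (KK')X = X(K'K).
    found = []
    for a, b, c, d in product(range(M), repeat=4):
        if ((a * a + b * c) % M, (b * (a + d)) % M,
                (c * (a + d)) % M, (d * d + b * c) % M) != (1, 0, 0, 1):
            continue  # X^2 is not the identity
        X = [[a, b], [c, d]]
        if matmul(KK, X) == matmul(X, KpK):
            found.append(X)
    return found

# Two sample involutory keys and the plaintext blocks of "MEET AT SEVEN Z".
K, K_prime = [[2, 3], [25, 24]], [[9, 5], [10, 17]]
P = [[12, 4], [4, 19], [0, 19], [18, 4], [21, 4], [13, 25]]
C, C_prime = matmul(P, K), matmul(P, K_prime)

candidates = two_message_candidates(C, C_prime)
# Step 3: decipher C with each candidate; the true key recovers P.
print(any(matmul(C, X) == P for X in candidates))
```

In practice the intruder does not know P, so the third step means reading each candidate decipherment and keeping the one that makes sense.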
Written Exercises 


1. Julius Caesar was known to encipher messages using our second en- 
cryption method from Section 6.1 with a = 1. This encryption 
method yields what is often called a shift cipher. Use a shift cipher 
to encipher the message “ATTACK AT DAWN”. Also, describe and 
illustrate procedures the intended recipient could use to decipher the 
message and an intruder could use to break the system. Explain why 
it is natural to say that this encryption method yields a shift cipher. 


2. Consider the following matrix A over $\mathbb{Z}_{26}$. 


13 7 
tsi A 


(a) Use the Hill encryption method with the key matrix A defined 
above to encipher the message, “SEND TARGET STATUS”. 


(b) Decipher “NDJLWLTBWFVXGSNV”, a message that has been 
enciphered using the Hill encryption method with the key matrix 
A defined above. 


3. Suppose you intercept “FLBIPURCRGAO”, a message that has been 
enciphered using the Hill encryption method with a 2 x 2 key matrix 
A over $\mathbb{Z}_{26}$, and you somehow know that the first four letters in the 
corresponding plaintext are “NCST”. Decipher the message. 


4. (a) Use the generalized Hill encryption method with the key matrix 
A from Written Exercise 2 and the vector B = (1,2) over Z26 to 
encipher the message, “ABORT MISSION”. 


(b) Decipher “UXSJOEWNOJHE”, a message that has been enci- 
phered using the generalized Hill encryption method with the 
key matrix A from Written Exercise 2 and the vector B = (1, 2) 
over $\mathbb{Z}_{26}$. 


5. Consider the following matrices A and B and vector $C_0$ over $\mathbb{Z}_{26}$. 
2 5 1 0 
a-[2 8] efi] anos 


(a) Use the variable matrix encryption method with the key matrix 
A defined above to encipher the message, “NEED BACKUP”. 
For choosing the vectors $B_i$, use the second of the three methods 
listed immediately before Example 6.6 with the matrix B and 
vector $C_0$ defined above. 

(b) Decipher “RTOWKRPTLS”, a message that has been enciphered 
using the variable matrix encryption method with the key matrix 
A defined above. The vectors $B_i$ were chosen using the second of 
the three methods listed immediately before Example 6.6 with 
the matrix B and vector $C_0$ defined above. 


6. Consider the recursive sequence $\{r_j\}$ over $\mathbb{Z}_{26}$ given by 

\[ r_{j+2} = (r_{j+1} + r_j) \bmod 26 \]

with $r_1 = 3$ and $r_2 = 5$. 


(a) Use the variable matrix encryption method with the key 
matrix A from Written Exercise 5 to encipher the message, 
“GO PACK”. For choosing the vectors $B_i$, use the third of the 
three methods listed immediately before Example 6.6 with the 
recursive sequence $\{r_j\}$ defined above. 


(b) Decipher “JAYGKI”, a message that has been enciphered using 
the variable matrix encryption method with the key matrix A 
from Written Exercise 5. The vectors $B_i$ were chosen using the 
third of the three methods listed immediately before Example 
6.6 with the recursive sequence $\{r_j\}$ defined above. 


7. Consider matrices A of size r x s and B of size s x r over a ring R. 
Find the size of the matrix 


\[ \begin{bmatrix} BA - I & B \\ 2A - ABA & I - AB \end{bmatrix} \]


and show that this matrix is involutory over R. 


8. Use the result from Written Exercise 7 to construct an involutory 
matrix of size 3 x 3 over Z26. Then use your result as the key matrix 
K in the modified Hill encryption method to encipher a plaintext of 
your choice with at least six characters. Also, show how to decipher 
the resulting ciphertext. 


9. Let A be a matrix of size n x n over $\mathbb{Z}_k$. Recall that 

\[ A (\operatorname{adj} A) = (\operatorname{adj} A) A = (\det A) I \]

where adj A represents the adjoint of A and I is the n x n identity. 
Show that A is invertible over $\mathbb{Z}_k$ if and only if the determinant of A 
has a multiplicative inverse in $\mathbb{Z}_k$. 


10. Show that $a \in \mathbb{Z}_k$ has a multiplicative inverse in $\mathbb{Z}_k$ if and only if 
$(a, k) = 1$. 
Maple Exercises 


1. Consider the following matrix A over $\mathbb{Z}_{26}$. 


\[ A = \begin{bmatrix} 9 & 10 & 0 & 20 & 7 \\ 4 & 3 & 14 & 23 & 16 \\ 7 & 2 & 5 & 7 & 5 \\ 21 & 1 & 25 & 3 & 1 \\ 1 & 5 & 4 & 3 & 0 \end{bmatrix} \]


(a) Use the Hill encryption method with the key matrix A defined 
above to encipher the message, “ABORT MISSION PROCEED 
WITH SECONDARY ORDERS”. 

(b) Decipher “ZQBTDDBIGZZDCQFRXFBXPVJERZRSBA”, a mes- 
sage that has been enciphered using the Hill encryption method 
with the key matrix A defined above. 


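2. Consider the following matrix B and vector $P_0$ over $\mathbb{Z}_{26}$. 

\[ B = \begin{bmatrix} 13 & 22 & 4 & 4 & 3 \\ 2 & 0 & 4 & 6 & 8 \\ 1 & 25 & 17 & 23 & 9 \\ 3 & 2 & 6 & 3 & 12 \\ 7 & 4 & 5 & 3 & 12 \end{bmatrix} \qquad P_0 = (1, 1, 1, 1, 1) \]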

(a) Use the variable matrix encryption method with the key matrix 
A from Maple Exercise 1 to encipher the message, “ATTACK 
FLANK AT SUNRISE”. For choosing the vectors B;, use the first 
of the three methods listed immediately before Example 6.6 with 
the matrix B and vector Po defined above. 


Decipher “ZRLGVCKZHWLMOSHXOGBU”, a message that has 
been enciphered using the variable matrix encryption method 
with the key matrix A from Maple Exercise 1. The vectors B; 
were chosen using the first of the three methods listed imme- 
diately before Example 6.6 with the matrix B and vector Pp 
defined above. 


rm, 
z 


3. Use the result from Written Exercise 7 to construct an involutory 
matrix of size 5 x 5 over $\mathbb{Z}_{26}$. (You may find the Maple blockmatrix 
command useful in constructing this matrix.) Then use your result as 
the key matrix K in the modified Hill encryption method to encipher 
a plaintext of your choice with at least 15 characters. Also, show how 
to decipher the resulting ciphertext. 
Chapter 7 


The RSA Cryptosystem 


In this chapter, we discuss one of the most well-known and popular cryp- 
tosystems ever developed — the RSA cryptosystem. One reason why it is 
so well-known and popular is that it is a classic public-key system. Re- 
call in general that everything in a cryptosystem is assumed to be public 
knowledge except the parameters in the enciphering function. A public-key 
system is one in which even these parameters can be public knowledge with- 
out compromising the security of the system. That is, using the notation 
introduced in Section 6.1, a public-key system is one in which the function 
$f$ can be public knowledge without revealing $f^{-1}$. The RSA cryptosystem 
is named for R. Rivest, A. Shamir, and L. Adleman, who first published it 
in 1978. 


Before formally presenting the RSA system, we consider a very simple 
example of the mathematics that govern it. Choose primes p = 5 and 
q = 11, and let n = pq = 55 and m = (p—1)(q—1) = 40. Next, 
choose a = 27, chosen so that (a,m) = 1, and let b = 3, chosen so that 
ab = 1 mod m. Then for x = 2, note that 


$$(x^a)^b = (2^{27})^3 = 2417851639229258349412352 = 2 \bmod 55 = x \bmod n.$$

An important thing to note about this computation is that 

$$x^{ab} = x \bmod n. \tag{7.1}$$

In fact, this equation will be true for any $x \in \mathbb{Z}$ because $a$ and $b$ were chosen 
so that 

$$ab = 1 \bmod m. \tag{7.2}$$

Thus, if we encipher a message by raising the plaintext to the power $a$, 
we can decipher the message by raising the ciphertext to the power $b$ and 
reducing modulo $n$. It is certainly not obvious that (7.1) will hold for any 
$x \in \mathbb{Z}$ provided (7.2) is true. Establishing this result will be one of our 
primary goals in Section 7.1. 
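
Before proving (7.1), it can be reassuring to check it by brute force for the small parameters above. The following is a minimal sketch in Python rather than Maple, chosen only for illustration; the variable name `failures` is our own, and the check covers one representative of each residue class modulo n.

```python
# Parameters from the introductory example.
p, q = 5, 11
n = p * q                 # 55
m = (p - 1) * (q - 1)     # 40
a, b = 27, 3

# (7.2): a and b are multiplicative inverses modulo m.
assert (a * b) % m == 1

# (7.1): raising to the power ab should fix every residue modulo n.
# pow(x, e, n) is Python's built-in modular exponentiation.
failures = [x for x in range(n) if pow(x, a * b, n) != x]
print(failures)           # an empty list means (7.1) held for every x tested
```

Of course, a check like this says nothing about other choices of p and q, which is why the general argument of Section 7.1 is needed.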


7.1 Mathematical Prerequisites 


Before establishing the fact that (7.1) will hold for any $x \in \mathbb{Z}$ provided 
(7.2) is true, we discuss some additional preliminaries. We first discuss how 
to find values of a and b that satisfy (7.2). Of course, it is not difficult to 
choose a, as it must only be relatively prime to m. Once a is chosen, we 
can then find b as the multiplicative inverse of a modulo m by constructing 
the Euclidean algorithm table (see Section 1.6) for a and m. We illustrate 
this in the following example. 


Example 7.1 Consider a = 27 and m = 40. To find a value for b that 
satisfies ab = 1 mod m, we first apply the Euclidean algorithm to a and m 
as follows. 


$$\begin{aligned} 40 &= 27 \cdot 1 + 13 \\ 27 &= 13 \cdot 2 + 1 \end{aligned}$$


Note that, as required, (a,m) = 1. Constructing the Euclidean algorithm 
table for a and m, we obtain the following. 


Row    Q     R     U     V 
-1     -     40    1     0 
 0     -     27    0     1 
 1     1     13    1    -1 
 2     2     1    -2     3 


Hence, $40(-2) + 27(3) = 1$. This immediately gives the result, for it states 
that $27(3) = 1 \bmod 40$, and thus $b = 3$ satisfies $ab = 1 \bmod m$. ■
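
The table construction in Example 7.1 is mechanical, so it can be sketched in a few lines of code. The following Python sketch is only an illustration (the text itself relies on Maple for this); the function names `euclid_table` and `inverse_mod` are our own, and each row is assumed to satisfy mU + aV = R as in the table above.

```python
def euclid_table(a, m):
    """Return the rows (Q, R, U, V) of the table, with m*U + a*V == R in every row."""
    rows = [(None, m, 1, 0), (None, a, 0, 1)]   # rows -1 and 0 have no quotient
    while rows[-1][1] != 0:
        _, r_prev, u_prev, v_prev = rows[-2]
        _, r_cur, u_cur, v_cur = rows[-1]
        quo = r_prev // r_cur
        rows.append((quo, r_prev - quo * r_cur,
                     u_prev - quo * u_cur, v_prev - quo * v_cur))
    return rows[:-1]          # drop the final row, whose remainder is 0


def inverse_mod(a, m):
    """Return b with a*b = 1 mod m, assuming (a, m) = 1."""
    _, r, _, v = euclid_table(a, m)[-1]
    if r != 1:
        raise ValueError("a and m are not relatively prime")
    return v % m              # express the inverse as a residue in 0..m-1


for row in euclid_table(27, 40):
    print(row)
print(inverse_mod(27, 40))
```

The last row printed carries R = 1 together with the coefficients U and V, and the V entry reduced modulo m is the desired inverse.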


Next, we show the general relationship between the values of n and m 
in the example in the introduction to this chapter. To do this we must first 
prove some general results about the ring of integers. 


Let $n$ be an integer with $n > 1$. Then the modular ring $\mathbb{Z}_n$ 
inherits many properties from $\mathbb{Z}$ since it is a quotient ring of $\mathbb{Z}$. 
Consider the set $U_n$ of units in $\mathbb{Z}_n$. That is, consider the set 
$U_n = \{x \in \mathbb{Z}_n \mid x \text{ has a multiplicative inverse in } \mathbb{Z}_n\}$. Note that $U_n$ can 
also be expressed as $U_n = \{x \in \mathbb{Z}_n \mid (x,n) = 1\}$, and that $U_n$ forms a 
multiplicative group (see Written Exercise 11). The order of $U_n$ is denoted 
in general by $\varphi(n)$. The function $\varphi$ is called the Euler-phi function. 


Theorem 7.1 If $p$ is a prime, then $U_p = \mathbb{Z}_p^*$ and $\varphi(p) = p - 1$. 
Proof. Exercise. ■


Theorem 7.2 Suppose $a$ and $b$ are integers with $(a,b) = 1$. Then 
$\varphi(ab) = \varphi(a)\varphi(b)$. 


Proof. Consider the sets $A = \{s \mid 0 < s < a \text{ and } (s,a) = 1\}$ and 
$B = \{r \mid 0 < r < b \text{ and } (r,b) = 1\}$. Then $|A| = \varphi(a)$ and $|B| = \varphi(b)$. Now, 
let $T = \{ar + bs \mid r \in B \text{ and } s \in A\}$. We claim that $(ar + bs, ab) = 1$. 
Suppose there exists a prime $p$ that divides both $ar + bs$ and $ab$. Then 
$p \mid a$ or $p \mid b$. If $p \mid a$, then $p \mid (ar + bs)$ implies $p \mid bs$. But since $(a,b) = 1$, then 
$p$ does not divide $b$. Hence, $p \mid s$. But $(s,a) = 1$, which is a contradiction. 
A similar argument holds if $p \mid b$. Thus, $(ar + bs, ab) = 1$. Suppose 
now that $ar + bs = ar' + bs' \bmod ab$ with $r, r' \in B$ and $s, s' \in A$. 
Then $a(r - r') = b(s' - s) \bmod ab$, and $b \mid a(r - r')$. Hence, $b \mid (r - r')$ 
since $(a,b) = 1$. But $0 < r, r' < b$. Therefore, $r = r'$. Similarly, 
$s = s'$. Let $U = \{w \mid w \text{ is the remainder when } ar + bs \text{ is divided by } ab\}$. 
Each element in $U$ is relatively prime to $ab$, and $|U| = |A \times B|$. Hence, 
$\varphi(ab) \geq |U| = |A \times B| = \varphi(a)\varphi(b)$. To show the desired equality, it is now 
sufficient to show that if $c \in \mathbb{Z}$ with $(c,ab) = 1$ and $0 < c < ab$, then $c \in U$. 
Since $(a,b) = 1$, then $ax + by = 1$ for some $x, y \in \mathbb{Z}$, and thus $axc + byc = c$. Let 
$z = xc$ and $t = yc$. Then $az + bt = c$. Since $(a,c) = 1$, it then follows that 
$(a,t) = 1$. Thus, $t = s \bmod a$ for some $s \in A$, and $bt = bs \bmod ab$. In a 
similar manner it can be seen that $az = ar \bmod ab$ for some $r \in B$. Hence, 
$ar + bs = c \bmod ab$, and $c \in U$. ■


Theorems 7.1 and 7.2 are of interest to us because of 
the following corollary, which states the general relationship between the 
values of n and m in the example in the introduction to this chapter. 


Corollary 7.3 Suppose $n = pq$ for distinct primes $p$ and $q$. Then 
$\varphi(n) = (p-1)(q-1)$. 


Proof. Exercise. ■
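
For small moduli, the statements of Theorems 7.1 and 7.2 and Corollary 7.3 can be checked numerically by counting units directly. The following Python sketch is an illustration only; the function name `phi` is our own, and the counting approach is practical only for small n.

```python
from math import gcd

def phi(n):
    """Euler phi function computed directly from the definition of U_n."""
    # Counting x = n instead of x = 0 makes no difference, since (n, n) = (0, n) = n.
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

print(phi(11))            # Theorem 7.1 with p = 11
print(phi(5) * phi(11))   # Theorem 7.2 with a = 5 and b = 11
print(phi(55))            # Corollary 7.3 with n = 5 * 11
```

The last two values agree with m = 40 from the example in the introduction to this chapter.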


We now show that (7.1) will hold for any $x \in \mathbb{Z}$ provided (7.2) is true. 
The main result we will use to show this is the following theorem, commonly 
called Fermat’s Little Theorem. 


Theorem 7.4 Let $p$ be a prime, and suppose $x \in \mathbb{Z}$ satisfies $(x,p) = 1$. 
Then $x^{p-1} = 1 \bmod p$. 


Proof. We claim first that the elements in the set 

$$S_1 = \{x \bmod p, \; 2x \bmod p, \; \ldots, \; (p-1)x \bmod p\}$$

are a rearrangement of the elements in the set 

$$S_2 = \{1, 2, \ldots, p-1\}.$$

To see this, note that if $jx \bmod p = kx \bmod p$ for some positive integers $j$ 
and $k$ less than $p$, then $p \mid (j-k)x$. But since $p$ does not divide $x$, this implies 
that $p \mid (j-k)$. Thus, because $j$ and $k$ are less than $p$, it follows that $j = k$. 
Now, since $S_1 = S_2$, the product of the elements in $S_1$ will be equal to the 
product of the elements in $S_2$. That is, $x^{p-1}(p-1)! \bmod p = (p-1)! \bmod p$. 
Hence, $p \mid (p-1)!\,(x^{p-1} - 1)$. Finally, since $p$ does not divide $(p-1)!$, then 
$p \mid (x^{p-1} - 1)$ or, equivalently, $x^{p-1} = 1 \bmod p$. ■
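
The rearrangement claim at the heart of this proof is easy to observe for a specific prime. The following Python sketch is an illustration only; the function name `multiples_mod` and the choice p = 11, x = 7 are our own.

```python
def multiples_mod(x, p):
    """The set S_1 = {x mod p, 2x mod p, ..., (p-1)x mod p} from the proof, sorted."""
    return sorted((j * x) % p for j in range(1, p))

p, x = 11, 7                      # any prime p and any x with (x, p) = 1 would do
s1 = multiples_mod(x, p)
s2 = list(range(1, p))            # S_2 = {1, 2, ..., p-1}
print(s1 == s2)                   # the multiples are a rearrangement of S_2
print(pow(x, p - 1, p))           # Fermat's Little Theorem predicts 1
```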


In the following theorem we establish the fact that (7.1) will hold 
provided (7.2) is true. 


Theorem 7.5 Suppose $p$ and $q$ are distinct primes and define $n = pq$ and 
$m = \varphi(n) = (p-1)(q-1)$. If $a$ and $b$ are integers that satisfy $ab = 1 \bmod m$, 
then $x^{ab} = x \bmod n$ for all $x \in \mathbb{Z}$. 


Proof. Since $ab = 1 \bmod m$, then $ab = 1 + km$ for some $k \in \mathbb{Z}$. Hence, for 
any $x \in \mathbb{Z}$, it follows that 

$$x^{ab} = x^{1+km} = x(x^m)^k = x(x^{p-1})^{k(q-1)}.$$

If $(x,p) = 1$, then by Theorem 7.4 we know that $x^{p-1} = 1 \bmod p$. Thus, 
$x^{ab} = x(1)^{k(q-1)} \bmod p = x \bmod p$. Also, if $(x,p) \neq 1$, then $x = 0 \bmod p$, 
and certainly $x^{ab} = x \bmod p$. Similarly, $x^{ab} = x \bmod q$. Hence, $p \mid (x^{ab} - x)$ 
and $q \mid (x^{ab} - x)$, and thus $pq \mid (x^{ab} - x)$. That is, $n \mid (x^{ab} - x)$ or, equivalently, 
$x^{ab} = x \bmod n$. ■


7.2 RSA Encryption and Decryption 


To encipher a message using the RSA cryptosystem, we first convert the 
message into a list of nonnegative integers by applying a mapping like the 
correspondence $\alpha$ from Chapter 6. We then choose distinct primes $p$ and $q$ 
and define $n = pq$ and $m = \varphi(n) = (p-1)(q-1)$. Next, we choose $a \in \mathbb{Z}_m^*$ 
such that $(a,m) = 1$, and find $b \in \mathbb{Z}_m^*$ that satisfies $ab = 1 \bmod m$. (Recall 
that b can be found by constructing the Euclidean algorithm table for a 
and m.) To encipher the message, we form ciphertext integers by raising 
the plaintext integers to the power a and reducing modulo n. According 
to Theorem 7.5, we can then recover the plaintext integers by raising the 
ciphertext integers to the power b and reducing modulo n. 


Example 7.2 In this example, we use the RSA cryptosystem to encipher 
and decipher the message, “NCSU”. We first apply the correspondence $\alpha$ 
from Chapter 6 to convert this message into the list of integers 13 2 18 
20. Next, we choose primes p = 5 and q = 11, and define n = pq = 55 and 
m = (p — 1)(q — 1) = 40. We then choose encryption exponent a = 27. To 
encipher the message, we perform the following calculations. 


$$\begin{aligned} 13^{27} &= 7 \bmod 55 \\ 2^{27} &= 18 \bmod 55 \\ 18^{27} &= 17 \bmod 55 \\ 20^{27} &= 15 \bmod 55 \end{aligned}$$


Hence, the ciphertext is the list of integers 7 18 17 15. (Although we could 
use $\alpha^{-1}$ to convert this particular list of integers back into a list of letters, 
conversion of an RSA ciphertext back into a list of alphabet characters is 
not usually possible. To see this, note that because the results of these 
encryption calculations were reduced modulo n = 55, these results could 
have been as large as n—1 = 54.) By Example 7.1, the decryption exponent 
that corresponds to the encryption exponent a = 27 in this example is 
b = 3. Hence, to decipher the message, we must only perform the following 
calculations. 


$$\begin{aligned} 7^{3} &= 13 \bmod 55 \\ 18^{3} &= 2 \bmod 55 \\ 17^{3} &= 18 \bmod 55 \\ 15^{3} &= 20 \bmod 55 \end{aligned}$$

Note that the results are the original plaintext integers. ■
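
The calculations in Example 7.2 can be reproduced with a few lines of code. The following Python sketch is an illustration only; the helper name `to_numbers` is our own, and it assumes that the correspondence from Chapter 6 sends A to 0, B to 1, and so on through Z to 25, which is consistent with the integers 13 2 18 20 above.

```python
import string

def to_numbers(message):
    """Assumed form of the correspondence: A -> 0, B -> 1, ..., Z -> 25."""
    return [string.ascii_uppercase.index(ch) for ch in message]

n, a, b = 55, 27, 3                          # modulus and exponents from Example 7.2
plain = to_numbers("NCSU")
cipher = [pow(x, a, n) for x in plain]       # encipher: x^a mod n
recovered = [pow(y, b, n) for y in cipher]   # decipher: y^b mod n
print(plain, cipher, recovered)
```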


We still have several topics to address regarding the RSA cryptosystem. 
Note first that no matter how large we choose the encryption exponent 
and modulus for the RSA system, the system as illustrated in Example 
7.2 will certainly not be secure because it will just yield a substitution 
cipher. However, we can use the RSA encryption procedure as presented to 
obtain a non-substitution cipher by simply grouping consecutive integers 
in the plaintext before enciphering. Because our exponentiation operations 
are done modulo n, we will still be able to convert between plaintext and 
ciphertext uniquely, provided the plaintext integers are grouped into blocks 
that are smaller than n. We illustrate this in the following example. 


Example 7.3 In this example, we again encipher and decipher the mes- 
sage, “NCSU”. We begin by choosing primes p = 79 and q = 151, and 
defining n = pq = 11929 and m = (p — 1)(q — 1) = 11700. Next, we 
choose a = 473, chosen so that (a,m) = 1. We can then use the Euclidean 
algorithm to find that b = 8237 satisfies ab = 1 mod m. Recall that our 
plaintext converts to the list of integers 13 2 18 20. Since we have cho- 
sen a 5-digit value for n, we can group the first two and last two plaintext 
integers into blocks that will be smaller than n. That is, we can express 
the plaintext as 1302 1820 (note that we use 02 for 2), and use the RSA 
encryption procedure as presented. To encipher the message, we perform 
the following calculations. 


$$\begin{aligned} 1302^{473} &= 7490 \bmod 11929 \\ 1820^{473} &= 9723 \bmod 11929 \end{aligned}$$


Hence, the ciphertext is 7490 9723. To decipher the message, we perform 
the following calculations. 


$$\begin{aligned} 7490^{8237} &= 1302 \bmod 11929 \\ 9723^{8237} &= 1820 \bmod 11929 \end{aligned}$$


We can then split the resulting 4-digit integers into the original 2-digit 
plaintext integers. ■
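
The grouping and splitting steps in Example 7.3 can be sketched as follows. This Python sketch is an illustration only; the helper names `group_blocks` and `split_block` are our own, and the sketch assumes the number of plaintext integers is a multiple of the block size.

```python
def group_blocks(numbers, per_block):
    """Join 2-digit plaintext integers into larger blocks, e.g. [13, 2] -> 1302."""
    blocks = []
    for i in range(0, len(numbers), per_block):
        digits = "".join(f"{x:02d}" for x in numbers[i:i + per_block])
        blocks.append(int(digits))
    return blocks

def split_block(block, per_block):
    """Undo group_blocks for a single block, restoring any leading zeros."""
    digits = str(block).zfill(2 * per_block)
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]

p, q, a = 79, 151, 473
n, m = p * q, (p - 1) * (q - 1)
b = pow(a, -1, m)                      # modular inverse of a (Python 3.8 or later)

blocks = group_blocks([13, 2, 18, 20], 2)
assert all(x < n for x in blocks)      # each block must be smaller than n
cipher = [pow(x, a, n) for x in blocks]
plain = [v for y in cipher for v in split_block(pow(y, b, n), 2)]
print(b, blocks, plain)
```

Writing 2 as 02 when forming the blocks is what allows the blocks to be split back into the original 2-digit integers without ambiguity.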


Another topic we must address regarding the RSA cryptosystem is how 
the system actually progresses between two people wishing to exchange a 
secret message. We stated in the introduction to this chapter that the RSA 
cryptosystem is a public-key system. This forces the system to progress in 
a particular way. 


Recall that in general we assume almost everything in a cryptosys- 
tem is public knowledge, including the form of the enciphering function. 
This means that we would assume an intruder who intercepts an RSA 
ciphertext would know that each ciphertext integer was formed as $x^a \bmod n$ 
for some plaintext integer x and positive integers a and n. The fact that 
the RSA cryptosystem is a public-key system means that we would as- 
sume the intruder also knows the actual values of a and n used in the 
encryption. For example, we would assume an intruder who intercepts the 
ciphertext in Example 7.3 would know that each ciphertext integer was 
formed as $x^{473} \bmod 11929$ for some plaintext integer $x$. Although this 
obviously affects the security of the system, we make this assumption because 
in practice the RSA system is used with a and n being public knowledge. 
The benefit of this is that two people wishing to use RSA to send a secret 
message across an insecure line of communication do not have to figure out 
a way to secretly exchange an encryption exponent and modulus. 


The comments made in the previous paragraph imply that the RSA 
system in Example 7.3 is not mathematically secure. This is because an 
intruder could mathematically break the system as follows. After find- 
ing the values of p and q in n = pq = 11929, an intruder could form 
m = (p—1)(q — 1), use the Euclidean algorithm to find that b = 8237 
satisfies ab = 1 mod m, and decipher the message by raising the ciphertext 
integers to the power b and reducing modulo n. Hence, none of the opera- 
tions necessary to break this system would take an intruder more than a few 
minutes. And even with significantly larger numbers, the Euclidean algo- 
rithm and modular exponentiation can easily be efficiently programmed on 
a computer. However, the first step in this process requires an intruder to 
find the two prime factors of n. It is the apparent difficulty of this problem, 
provided p and q are very large, that gives the RSA system an extremely 
high level of security. For example, if p and q are both around 100 digits 
long, the fastest known factoring algorithms would generally take millions 
of years to factor n = pq, even when programmed on a computer that can 
perform millions of operations per second. (We make some comments on 
choosing very large prime numbers in Sections 7.3 and 7.5, and some com- 
ments on factoring numbers with very large prime factors in Sections 7.3 
and 7.6.) Hence, even if the encryption exponent a is public knowledge, 
an intruder should not be able to determine the decryption exponent b. 
This is precisely why the RSA cryptosystem is called a public-key system: 
the parameters in the enciphering function $f(x) = x^a \bmod n$ can be public 
knowledge without revealing the parameter $b$ in the deciphering function 
$f^{-1}(x) = x^b \bmod n$. 


We now mention how the RSA system actually progresses between 
two people wishing to exchange a secret message across an insecure line of 
communication. Because only the intended recipient of the message must 
be able to decipher the message, the intended recipient of the message 
initiates the process by choosing primes p and q, and defining n = pq 
and m = (p—1)(q—1). The intended recipient then chooses an encryption 
exponent $a \in \mathbb{Z}_m^*$ such that $(a,m) = 1$ and, using the Euclidean algorithm if 
necessary, finds $b \in \mathbb{Z}_m^*$ that satisfies $ab = 1 \bmod m$. The intended recipient 
then sends the values of $a$ and $n$ to the originator of the message across the 
insecure line of communication, forcing the assumption that $a$ and $n$ are 
public knowledge. The originator of the message enciphers the message by 
applying the function $f(x) = x^a \bmod n$ to the plaintext integers, and then 
sends the resulting ciphertext integers to the intended recipient across the 
insecure line of communication. Since only the intended recipient knows 
$b$, only the intended recipient can decipher the message by applying the 
function $f^{-1}(x) = x^b \bmod n$ to the ciphertext integers. 


Example 7.4 Suppose we wish to use the RSA cryptosystem to send the 
message, “NCSU” to a colleague across an insecure line of communication. 
Our colleague begins the process by choosing primes p and q, and defining 
n = pq = 363794227 and m = (p — 1)(q — 1). Next, our colleague chooses 
a = 13783, chosen so that (a,m) = 1, and uses the Euclidean algorithm 
to find $b \in \mathbb{Z}_m^*$ that satisfies $ab = 1 \bmod m$. Our colleague then sends the 
values of a and n to us across the insecure line of communication. Recall 
that our plaintext converts to the list of integers 13 02 18 20. Since 
our colleague has chosen a 9-digit value for n, we can group all four of 
these 2-digit plaintext integers into a single block that will be smaller than 
n. That is, we can express the plaintext as 13021820, and encipher our 
message by applying the function $f(x) = x^a \bmod n$ to this plaintext integer. 
To encipher the message we perform the following calculation. 


$$13021820^{13783} = 91518013 \bmod 363794227$$


We would then transmit the ciphertext integer 91518013 to our colleague 
across the insecure line of communication. In order for an intruder who 
intercepts this ciphertext and the previously transmitted values of a and 
n to decipher the message, the intruder would need to find the decryption 
exponent $b$. But to find $b$, an intruder would first need to find $m$. And to 
find m, an intruder would need to find the prime factors of n, a problem 
that, as we have stated, is essentially impossible provided our colleague has 
chosen sufficiently large values for p and q. This would not pose a problem 
for our colleague, however, because our colleague began the process by 
choosing p and q. Hence, our colleague would know that the prime factors 
of n = pq = 363794227 are p = 14753 and q = 24659, and would have 
no difficulty in forming m = (p — 1)(q — 1) = 363754816 and using the 
Euclidean algorithm to find that b = 20981287 satisfies ab = 1 mod m. To 
decipher the message, our colleague would then only need to perform the 
following calculation. 


$$91518013^{20981287} = 13021820 \bmod 363794227$$


(We make some comments on efficiently raising large numbers to large 
powers in Sections 7.3 and 7.4.) ■
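
The division of knowledge between the two parties in Example 7.4 can be made explicit in code. The following Python sketch is an illustration only; the names `public_key` and `encipher` are our own, and the sketch simply replays the numbers of the example on one machine rather than modeling an actual communication channel.

```python
# Recipient (our colleague): chooses the primes and keeps them private.
p, q = 14753, 24659
n = p * q
m = (p - 1) * (q - 1)
a = 13783
b = pow(a, -1, m)          # private decryption exponent (Python 3.8 or later)
public_key = (a, n)        # the only values sent across the insecure line

# Originator (us): only the public values a and n are used.
def encipher(x, key):
    a_pub, n_pub = key
    assert 0 <= x < n_pub  # the plaintext block must be smaller than n
    return pow(x, a_pub, n_pub)

ciphertext = encipher(13021820, public_key)

# Recipient: b together with n is all that is needed to decipher.
recovered = pow(ciphertext, b, n)
print(n, m, b, recovered)
```

Note that p, q, m, and b never appear on the originator's side of the exchange.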
7.3 The RSA Cryptosystem with Maple 


In this section we show how Maple can be used to encipher and decipher 
messages using the RSA cryptosystem. 


We begin by mentioning several Maple commands that are useful for 
finding large primes. The first command we will mention is the nextprime 
command, which returns the smallest prime larger than an integer input. 
For example, the following command returns the smallest prime larger than 
400043344212007458000. 

> nextprime(400043344212007458000); 


400043344212007458013 


A similar command is the prevprime command, which returns the largest 
prime smaller than an integer input. For example, the following command 
returns the largest prime smaller than 400043344212007458000. 

> prevprime(400043344212007458000); 


400043344212007457977 


A final primality command we will mention is the isprime command, which 
returns true if an integer input is prime and false if not. For example, 
the following commands imply that 400043344212007457977 is prime while 
400043344212007458000 is not. 

> isprime(400043344212007457977); 


true 

> isprime(400043344212007458000); 


false 


We should mention that the nextprime, prevprime, and isprime com- 
mands are probabilistic routines that employ primality tests (see Section 
7.5). This means that the output returned by Maple is in general guaran- 
teed to be correct with extremely high probability, but not absolutely. 


We now show how Maple can be used to perform the RSA encipherment 
and decipherment procedures. We begin by finding large primes p and q. 
> p := nextprime(400043344212007458000); 


p := 400043344212007458013 


> q := nextprime(500030066366269001200); 


q := 500030066366269001203 


Next, we define $n = pq$ and $m = (p-1)(q-1)$. 

> n := p*q; 


n := 200033699955714283345172521584008468989639 


> m := (p-1)*(q-1); 
m := 200033699955714283344272448173430192530424 


And we will use the following encryption exponent a. 


> a := 10098768900987679000910003; 
a := 10098768900987679000910003 


To verify that this value of a is a valid RSA encryption exponent, we enter 
the following Maple igcd command, which returns the greatest common 
divisor of the integers a and m. Note that, as required, (a,m) = 1. 


> igcd(a, m); 


1 


We now use the RSA encipherment procedure to encipher the message, 
“RETURN TO HEADQUARTERS”. (Because the letter “I” represents 
$\sqrt{-1}$ in Maple, the user-written procedures that follow had to be designed 
for messages expressed with lower-case letters. Also, contrary to the way we 
defined messages in Section 6.3, note that the following message is defined 
as a string of letters without spaces rather than as a vector containing the 
letters.) 


> message := `returntoheadquarters`; 


message := returntoheadquarters 


Next, we convert this message into a list of 2-digit integers and combine 
these integers into a single block. To do this, we have provided the 
user-written procedure to_number, for which code is given in Appendix C.2. If 
this procedure is saved as the text file to_number in the directory from which 
we are running Maple, then we can include the to_number procedure in 
this Maple session by entering the following command. 


> read to_number; 


We can then convert message into its numerical equivalent as a single block 
by entering the following command. 


> plaintext := to_number(message); 


plaintext := 1704192017131914070400031620001719041718 
Because this plaintext integer is smaller than n, we can encipher this message 
as a single block. That is, we can encipher this message by raising 
plaintext to the power a and reducing modulo n. To do this, we enter the 
following command. (Because this modular exponentiation involves such a 
large exponent, we use the Maple &^ command instead of just ^ for the 
exponentiation. By using &^, we cause Maple to do the exponentiation in 
a very efficient way, like the technique discussed in Section 7.4.) 


> ciphertext := plaintext &^ a mod n; 


ciphertext := 39705667751051336812284136334817473485289 


To decipher this message, we must find a decryption exponent b that sat- 
isfies ab = 1modm. We can do this by entering the following Maple 
igcdex command. Like the preceding igcd command, the following igcdex 
command returns the greatest common divisor of the integers a and m. 
However, the following igcdex command also takes two additional user- 
defined variable inputs, which it leaves as integers b and y that satisfy 
ab + my = (a,m). Since (a,m) = 1, these will be values of b and y that 
satisfy ab + my = 1 or, equivalently, ab = 1 mod m. Thus, we can find a 
decryption exponent b by entering the following command. 
> igcdex(a, m, 'b', 'y'); 


1 


To see the decryption exponent b defined by the previous command, and to 
express this value as a positive number less than m, we enter the following 
command. 


> b := b mod m; 


b := 54299300950841826990071853678997985400035 


Next, by entering the following command we verify that this value of b 
satisfies ab = 1 mod m. 


> a*b mod m; 
1 


To recover the plaintext integer, we must only raise ciphertext to the 
power b and reduce modulo n. 


> plaintext := ciphertext &^ b mod n; 


plaintext := 1704192017131914070400031620001719041718 


To see the original plaintext letters, we must split this single block into a 
list of 2-digit integers and convert these integers back into letters. To do 
this, we have provided the user-written procedure to_letter, for which code 
is given in Appendix C.2. If this procedure is saved as the text file to_letter 
in the directory from which we are running Maple, then we can include 
the to_letter procedure in this Maple session by entering the following 
command. 


> read to_letter; 


We can then convert plaintext back into a list of letters by entering the 
following command. 


> to_letter(plaintext); 


returntoheadquarters 
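
For readers who want to see what conversions of this kind involve, the following Python sketch gives hypothetical analogues of to_number and to_letter. It is not the Maple code from Appendix C.2; it only assumes the correspondence a = 00, b = 01, and so on through z = 25, which is consistent with the plaintext integer shown above.

```python
def to_number(message):
    """Map a -> 00, b -> 01, ..., z -> 25 and join the digits into one integer."""
    return int("".join(f"{ord(ch) - ord('a'):02d}" for ch in message))

def to_letter(number):
    """Split an integer back into 2-digit pieces and map each piece to a letter."""
    digits = str(number)
    if len(digits) % 2 == 1:          # restore a single leading zero lost by int()
        digits = "0" + digits
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    return "".join(chr(v + ord('a')) for v in pairs)

plaintext = to_number("returntoheadquarters")
print(plaintext)
print(to_letter(plaintext))
```

One limitation of this sketch is that a message beginning with the letter a would lose its leading 00 when converted to an integer, so such messages would need separate handling.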


A final command we will mention is the ifactor command, which re- 
turns the prime factorization of an integer input. For example, the following 
command very quickly returns the prime factorization of the 43-digit integer 
1118516508138307725195354324934560155358253. 


> ifactor(1118516508138307725195354324934560155358253); 


(17)^5 (389) (45001200019828331)^2 


Recall that the security of the RSA cryptosystem is based on the apparent 
difficulty of factoring the value of n. Hence, in order for the RSA system 
used in this section to be secure, it should be very difficult for an intruder 
to factor the 42-digit value of n used in this section. Although this value 
of n is one digit shorter than the integer used in the preceding command, 
because the prime factors of n are both very large, it will take ifactor much 
more time to return these prime factors. For example, the reader may wish 
to enter the preceding and following commands to see the difference. (Make 
sure you know how to interrupt the following command before entering it.) 


> ifactor(200033699955714283345172521584008468989639); 


And recall, as mentioned in Section 7.2, if p and q are both around 100 
digits long, then the fastest known factoring algorithms, including the one 
employed by the ifactor command, would in general take millions of years 
to factor n = pq, even when programmed on a computer that can perform 
millions of operations per second. 


7.4 A Note on Modular Exponentiation 


Securely enciphering and deciphering messages using the RSA cryptosys- 
tem generally requires modular exponentiation with extremely large bases 
and exponents. For example, to decipher the message in Section 7.3, 
we had to raise the number 39705667751051336812284136334817473485289 
to the power 54299300950841826990071853678997985400035 and reduce 
the result modulo 200033699955714283345172521584008468989639. Even 
using the world’s fastest computer, performing this computation by ac- 
tually multiplying 39705667751051336812284136334817473485289 by itself 
54299300950841826990071853678997985400034 times would take essen- 
tially an infinite amount of time. In this section we show a technique that 
can be used to perform this modular exponentiation in a very efficient way. 


For convenience, we will illustrate this technique for efficient modular 
exponentiation in the calculation 


$$91518013^{20981287} = 13021820 \bmod 363794227 \tag{7.3}$$


that deciphered the message in Example 7.4. This modular exponentia- 
tion can be done in a much more efficient way than multiplying 91518013 
by itself 20981286 times. To do this computation more efficiently, 
we begin by computing the values of $(91518013)^{2^i} \bmod 363794227$ for 
$i = 1, \ldots, 24$. That is, for $P = 91518013$ and $M = 363794227$, we compute 
$P^2, P^4, P^8, P^{16}, \ldots, P^{2^{24}}$ and reduce each modulo $M$. Note that each 
$P^{2^i} \bmod M$ can be found by squaring $P^{2^{i-1}} \bmod M$. Thus, finding these 
values requires 24 total multiplications. The modular exponentiation in 
(7.3) can then be completed by computing 


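$$\begin{aligned}
P^{20981287} \bmod M
&= P^{16777216+4194304+8192+1024+512+32+4+2+1} \bmod M \\
&= P^{2^{24}+2^{22}+2^{13}+2^{10}+2^{9}+2^{5}+2^{2}+2^{1}+2^{0}} \bmod M \\
&= P^{2^{24}} \cdot P^{2^{22}} \cdot P^{2^{13}} \cdot P^{2^{10}} \cdot P^{2^{9}} \cdot P^{2^{5}} \cdot P^{2^{2}} \cdot P^{2^{1}} \cdot P^{2^{0}} \bmod M
\end{aligned}$$
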

which requires only 8 additional multiplications. Hence, this technique can 
be used to perform the modular exponentiation in (7.3) with only 32 multi- 
plications. This is, of course, much fewer than the 20981286 multiplications 
necessary to multiply P by itself 20981286 times. 
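
The technique just described, often called repeated squaring, can be sketched in a few lines. The following Python sketch is an illustration only; the function name `power_mod` and the returned counts are our own additions, the exponent is assumed to be at least 1, and in practice Python's built-in three-argument pow does the same job.

```python
def power_mod(P, e, M):
    """Compute P**e mod M by repeated squaring; also count the multiplications."""
    bits = bin(e)[2:][::-1]            # binary digits of e, lowest power first
    squares = [P % M]                  # squares[i] = P**(2**i) mod M
    for _ in range(len(bits) - 1):
        squares.append(squares[-1] * squares[-1] % M)
    squarings = len(bits) - 1

    # Multiply together the squares that correspond to the 1 bits of e.
    factors = [squares[i] for i, bit in enumerate(bits) if bit == "1"]
    result = factors[0]
    for f in factors[1:]:
        result = result * f % M
    multiplications = len(factors) - 1
    return result, squarings, multiplications

P, e, M = 91518013, 20981287, 363794227
result, squarings, multiplications = power_mod(P, e, M)
print(squarings, multiplications)      # squarings, then additional multiplications
print(result == pow(P, e, M))          # agrees with Python's built-in pow
```

Because every intermediate product is reduced modulo M, no number larger than the square of M ever has to be stored.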


It is not difficult to see that this technique for efficiently computing 
$P^a \bmod M$ requires at most $2 \log_2 a$ multiplications (see Written Exercise 
6). Hence, to perform even the massive modular exponentiation mentioned 
at the beginning of this section, this technique would require at most only 

$$2 \log_2 54299300950841826990071853678997985400035 \approx 270$$


multiplications. 
7.5 A Note on Primality Testing 


Recall that to construct a secure RSA cryptosystem, the primes p and q 
chosen for the system must be very large. For example, we mentioned in 
Section 7.2 that if p and q are both around 100 digits long, it would in 
general take an intruder millions of years to break the system, even using 
a computer that can perform millions of operations per second. However, 
constructing an RSA system with such large primes is not particularly easy, 
for it is not particularly easy to find such large primes. In fact, motivated in 
part by the development of public-key cryptosystems like the RSA system, 
much research has been done recently in the area of primality testing. 


Contrary to what the general name primality test suggests, a primality 
test is a criterion that can be used to determine if a specific number is not 
prime. The conclusions that can be drawn from applying a primality test to 
a number n are that either n “fails” the test and is definitely not prime, or 
that n “passes” the test and is probably prime (with probability depending 
on the “power” of the test). 


The most direct and conclusive way to determine if a large odd integer 
n is prime is to try to find nontrivial factors of n by trial and error. We can 
do this systematically by checking to see if m|n as m runs through the odd 
integers starting with $m = 3$ and stopping when $m$ reaches $\sqrt{n}$. While this 
would reveal with certainty whether n was prime or composite, it would 
require many more divisions than could reasonably be done if n was of any 
significant size. In the remainder of this section we briefly discuss a very 
well-known and simple primality test based on Fermat’s Little Theorem 
(Theorem 7.4). 


If $n$ is prime, then as a consequence of Fermat’s Little Theorem, 

$$a^{n-1} = 1 \bmod n \tag{7.4}$$

for all $a \in \mathbb{Z}_n^*$. Hence, it follows that if $a^{n-1} \neq 1 \bmod n$ for any $a \in \mathbb{Z}_n^*$, 
then $n$ is not prime. Thus, we can test the primality of an integer $n$ by 
checking to see if (7.4) holds for certain values of $a$ in $\mathbb{Z}_n^*$. While this test 
is very easy to perform, there are some values of $a$ for which (7.4) holds 
even when $(a,n) = 1$ and $n$ is composite. In such cases, $n$ is called a 
pseudoprime to the base $a$. For example, $2^{340} = 1 \bmod 341$ even though 
341 is not prime. Thus, 341 is a pseudoprime to the base 2. However, since 
$3^{340} = 56 \bmod 341$, then 341 is not a pseudoprime to the base 3. 


Pseudoprimes are scarce relative to the primes. For example, there are 
only 245 pseudoprimes to the base 2 less than one million, while there are 
78498 primes less than one million. Also, most pseudoprimes to the base 2 
are not pseudoprimes to many other bases. However, there do exist composite 
integers $n$ that are pseudoprime to every base $a < n$ with $(a,n) = 1$. 
Such numbers are called Carmichael numbers. There are 2163 Carmichael 
numbers less than $2.5 \times 10^{10}$. The smallest Carmichael number is 561. 
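
The test based on (7.4), and the behavior of 341 and 561 under it, can be observed directly. The following Python sketch is an illustration only; the function names `passes_fermat` and `passes_all_bases` are our own, and `passes_all_bases` is practical only for small n because it tries every base.

```python
from math import gcd

def passes_fermat(n, a):
    """True when a**(n-1) = 1 mod n, that is, when n passes the test (7.4) for base a."""
    return pow(a, n - 1, n) == 1

def passes_all_bases(n):
    """True when n passes (7.4) for every base a < n with (a, n) = 1."""
    return all(passes_fermat(n, a) for a in range(2, n) if gcd(a, n) == 1)

print(passes_fermat(341, 2))     # 341 = 11 * 31 passes for the base 2
print(pow(3, 340, 341))          # the base 3 exposes 341 as composite
print(passes_all_bases(561))     # 561 = 3 * 11 * 17 passes for every coprime base
```

A prime also passes for every base, so passing this test for all bases does not by itself distinguish a Carmichael number from a prime.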


There are many primality tests that are more definitive in their con- 
clusions than the test described above. For example, a further primality 
test based on Fermat’s Little Theorem that is also very easy to perform 
fails only for an extremely small number of composites called strong pseu- 
doprimes. In fact, there is only one strong pseudoprime to the bases 2, 3, 
5, and 7 less than $2.5 \times 10^{10}$. There is no strong pseudoprime analogue to 
Carmichael numbers. 


7.6 A Note on Integer Factorization 


Recall that the security of the RSA cryptosystem is based on the apparent 
difficulty of factoring a number that is the product of two very large distinct 
primes. As in the area of primality testing, the development of public-key 
cryptosystems like the RSA system has motivated much research in the 
area of integer factorization. In this section we briefly discuss a very simple 
technique for integer factorization called Fermat factorization. Despite the 
fact that this factorization technique is quite old, Fermat factorization is 
still a very useful technique for factoring numbers that are the product of 
two very large distinct primes that are relatively close together. 


Let n = pq be the product of two very large distinct primes, and 
suppose we wish to determine the values of p and q from the knowledge 
of n. The most direct way to find p and q would be by trial and error. 
However, this would certainly not be feasible if both p and q were of any 
significant size. But if p and q were relatively close together, then even if 
they were very large we could determine them as follows. Let x = (p + q)/2 
and y = (p - q)/2. Then n = pq = x^2 - y^2 = (x + y)(x - y). And since n 
has prime factors p and q, p and q must be equal to x + y and x - y. 
Hence, to determine p and q, we need only find the values of x and y. To 
find x and y, we begin by assuming that x is the smallest integer larger 
than sqrt(n). Since n = x^2 - y^2, if we have assumed the correct value of x, 
then x^2 - n will be the perfect square y^2. If x^2 - n is not a perfect square, 
then we have assumed an incorrect value for x, and we increase x by one 
and repeat. We continue to repeat this process, each time increasing x by 
one, until x^2 - n is a perfect square. Note that if p and q are relatively close 
together, then the number of times this process must be repeated should 
be relatively small. 


Example 7.5 Suppose that we wish to find the two prime factors of 
n = pq = 64349. Since the smallest integer larger than sqrt(64349) is 254, 
we begin by letting x = 254. But since 254^2 - n = 167 is not a perfect 
square, 254 is not the correct value for x. Next, we try x = 255. Since 
255^2 - n = 676 = 26^2 is a perfect square, the correct values of x and 
y are x = 255 and y = 26. Thus, the prime factors of n are x + y = 281 
and x - y = 229. 
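The procedure used in Example 7.5 can be sketched in a few lines of Python. This is only an illustration under the assumption that n is odd (so that the search is guaranteed to stop); the name fermat_factor is our own.

```python
from math import isqrt

def fermat_factor(n):
    """Fermat factorization of an odd integer n > 1; returns (x - y, x + y)."""
    x = isqrt(n)
    if x * x < n:
        x += 1                 # start at the smallest integer >= sqrt(n)
    while True:
        y2 = x * x - n         # candidate for y^2
        y = isqrt(y2)
        if y * y == y2:        # x^2 - n is a perfect square
            return x - y, x + y
        x += 1                 # otherwise increase x by one and repeat

print(fermat_factor(64349))
```

For n = 64349 the loop stops on its second pass, at x = 255, exactly as in the example. When the two prime factors are far apart the number of passes grows quickly, which is why the method is only attractive for primes that are close together.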


In comparing the problems of primality testing and integer factor- 
ization, we should mention that factoring a known composite is in general 
significantly more time-consuming than finding a prime of approximately 
the same size. We have stated several times that the security of the RSA 
cryptosystem is based on the apparent difficulty of factoring a number that 
is the product of two very large distinct primes. To be more precise, the 
security of the RSA cryptosystem is based on the fact that it is apparently 
much more time-consuming for an intruder to factor the publicly known 
value of n = pq than for the intended recipient of the message to choose 
p and q. (We use the word “apparently” because it has never been con- 
clusively proven that factorization is significantly more time-consuming. 
Evidence, however, strongly suggests this.) 


7.7 A Note on Digital Signatures 


When the idea of public-key cryptography was developed, one way in which 
it was envisioned that it could be used was as follows. Suppose a group of 
people all wish to be able to communicate spontaneously with each other 
across a series of insecure lines of communication. For illustration, suppose 
they wish to use the RSA system to encipher their messages. To use the 
RSA system most effectively, each person in the group could choose their 
own secret primes p and q, form their own personal value for n = pq, and 
choose their own personal encryption exponent a. Each person in the group 
could then make their values of n and a public knowledge. Then, whenever 
a person in the group wanted to send another person in the group a secret 
message, they could use the intended recipient’s public values of n and a to 
encipher the message. That way, only the intended recipient would be able 
to decipher the message. However, this leads to a problem for the intended 
recipient of the message, for the intended recipient would have no way to 
verify that the received message was sent by the person claiming to have 
sent it. This problem can be avoided as follows. 


Suppose we wish to send the secret message P to a colleague across an 
insecure line of communication using RSA. Assume we have made public 
our personal RSA modulus n_1 and encryption exponent a_1 while keeping 
our decryption exponent b_1 secret, and our colleague has made public his or 
her personal RSA modulus n_2 and encryption exponent a_2 while keeping 
his or her decryption exponent b_2 secret. Suppose also that n_1 < n_2. 
To encipher our message, instead of applying our colleague’s encryption 
exponent and modulus directly to the plaintext, we first apply our own 
decryption exponent and modulus. That is, instead of sending our colleague 
the ciphertext P^{a_2} mod n_2, we first compute P_1 = P^{b_1} mod n_1, and then 
send our colleague the ciphertext C_1 = P_1^{a_2} mod n_2. Our colleague can 
easily decipher this message by first applying his or her decryption exponent 
and modulus to obtain P_1 = C_1^{b_2} mod n_2, and then applying our publicly 
known encryption exponent and modulus to obtain P = P_1^{a_1} mod n_1. Since 
the decryption exponent b_1 we used in enciphering the message is known 
only to us, our colleague would know that only we could have enciphered the 
message. Because it has the effect of authenticating the message, applying 
our own decryption exponent and modulus in the encipherment of a message 
is sometimes called signing the message. For the case when n_1 > n_2, see 
Written Exercise 9. 
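The signing procedure above can be illustrated with toy numbers. In the Python sketch below, the primes and encryption exponents are hypothetical values chosen by us only so that n_1 < n_2; they are far too small for real use, and the function names are our own.

```python
# Hypothetical toy parameters (not from the text), chosen so that n1 < n2.
p1, q1, a1 = 47, 59, 17        # our primes and public exponent: n1 = 2773
p2, q2, a2 = 83, 101, 3439     # colleague's primes and public exponent: n2 = 8383

n1, n2 = p1 * q1, p2 * q2
b1 = pow(a1, -1, (p1 - 1) * (q1 - 1))   # our secret decryption exponent
b2 = pow(a2, -1, (p2 - 1) * (q2 - 1))   # colleague's secret decryption exponent
assert n1 < n2

def sign_and_encipher(P):
    """Sender: P1 = P^b1 mod n1, then C1 = P1^a2 mod n2."""
    P1 = pow(P, b1, n1)
    return pow(P1, a2, n2)

def decipher_and_verify(C1):
    """Recipient: P1 = C1^b2 mod n2, then P = P1^a1 mod n1."""
    P1 = pow(C1, b2, n2)
    return pow(P1, a1, n1)

P = 1234
print(decipher_and_verify(sign_and_encipher(P)) == P)
```

The condition n_1 < n_2 matters because the signed value P_1 is a residue modulo n_1 and must survive reduction modulo n_2 unchanged. The call pow(a, -1, m) for a modular inverse requires Python 3.8 or later.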


Authentication of messages has been a very important and highly stud- 
ied branch of cryptography for many years. In fact, it is interesting to note 
that in the title of their classic paper, “A Method for Obtaining Digital 
Signatures and Public-Key Cryptosystems”, in which Rivest, Shamir, and 
Adleman introduced the RSA system, the notion of a digital signature was 
given precedence over that of a public-key cryptosystem. 


7.8 The Diffie-Hellman Key Exchange 


Recall that two people wishing to use the RSA cryptosystem to exchange 
a secret message across an insecure line of communication make their en- 
cryption exponent public knowledge. In this section we discuss a technique 
that can be used by two people to keep an RSA encryption exponent secret 
while communicating only across an insecure line of communication.¹ 


There are several techniques by which two people can agree upon a 
cryptographic key secretly without having a secure way to communicate. 
One technique is the Diffie-Hellman key exchange, a process presented by 
W. Diffie and M. Hellman in their classic paper, “New Directions in Cryp- 
tography”, in which they introduced the idea of public-key cryptography. 
In order to describe a way of incorporating this key exchange system with 
the RSA system, suppose we wish to receive a secret message from a col- 
league using RSA. Furthermore, suppose we and our colleague would like 
to agree upon our encryption exponent secretly while communicating only 
across an insecure line of communication. We can accomplish this by the 
following steps in the Diffie-Hellman key exchange. 


¹ Copyright 1999 by COMAP, Inc. This material appeared in the spring 1999 issue of 
UMAP (see [10]). 


1. We choose primes p and q, form n = pq, and choose a positive integer 
k < n with (k,n) = 1. We then send the values of k and n to our 
colleague across the insecure line of communication. 


2. We choose a positive integer r < n, compute k^r mod n, and send 
the result to our colleague while keeping r secret. Meanwhile, our 
colleague chooses a positive integer s < n, computes k^s mod n, and 
sends the result to us while keeping s secret. 


3. Both we and our colleague form the candidate encryption exponent 
a = k^{rs} mod n, which we compute as (k^s)^r mod n, and our colleague 
computes as (k^r)^s mod n. Since we know p and q, we can form 
m = (p - 1)(q - 1) and determine if a is a valid RSA encryption 
exponent by determining if (a,m) = 1. If a is not a valid RSA 
encryption exponent, we repeat the process. 


After we obtain a valid RSA encryption exponent, our colleague can then 
encipher his or her message using the usual RSA encipherment procedure 
with encryption exponent a and modulus n. 


Example 7.6 Suppose we choose primes p = 83 and q = 101 so 
that n = 8383 and m = 8200. Suppose also that we choose k = 256, 
and send k and n to our colleague. We then choose r = 91, compute 
256^91 mod 8383 = 2908, and send the result to our colleague while 
keeping r secret. Meanwhile, our colleague chooses s = 4882, computes 
256^4882 mod 8383 = 1754, and sends the result to us while keeping s secret. 
Both we and our colleague then form the candidate encryption exponent 
a = 6584, which we compute as 1754^91 mod 8383 = 6584, and our colleague 
computes as 2908^4882 mod 8383 = 6584. However, because this value of a 
is not a valid RSA encryption exponent (6584 is not relatively prime to m), 
we would inform our colleague that we must repeat the process. For the 
second attempt, suppose we choose the same values for p, q, and k. We 
then choose r = 17, compute 256^17 mod 8383 = 5835, and send the result 
to our colleague. Meanwhile, our colleague chooses s = 109, computes 
256^109 mod 8383 = 1438, and sends the result to us. Both we and our 
colleague then form the candidate encryption exponent a = 3439, which 
we compute as 1438^17 mod 8383 = 3439, and our colleague computes as 
5835^109 mod 8383 = 3439. Since this value of a is a valid RSA encryption 
exponent, we would confirm to our colleague that he or she could proceed 
with the usual RSA encipherment procedure. 
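The two attempts in Example 7.6 can be replayed with a short Python sketch. The function name exchange is our own, and the code simply follows the three steps listed before the example.

```python
from math import gcd

p, q = 83, 101
n, m = p * q, (p - 1) * (q - 1)    # n = 8383, m = 8200
k = 256

def exchange(r, s):
    """Return (k^r mod n, k^s mod n, shared candidate exponent a)."""
    kr, ks = pow(k, r, n), pow(k, s, n)   # the two transmitted values
    a_ours = pow(ks, r, n)                # we compute (k^s)^r mod n
    a_theirs = pow(kr, s, n)              # colleague computes (k^r)^s mod n
    assert a_ours == a_theirs
    return kr, ks, a_ours

for r, s in [(91, 4882), (17, 109)]:
    kr, ks, a = exchange(r, s)
    verdict = "valid" if gcd(a, m) == 1 else "not valid"
    print(r, s, kr, ks, a, verdict)
```

Only the party who knows p and q can form m and carry out the final validity check, which is why in the example we are the ones who tell our colleague whether to repeat the process.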


Note that in this key exchange system, we must assume the values of 
k, n, k^r mod n, and k^s mod n are known to intruders since they were all 
transmitted across an insecure line of communication. Hence, in order for 
this key exchange system to be secure, it should be an essentially impossible 
problem for an intruder to determine k^{rs} mod n from the knowledge of k, 
n, k^r mod n, and k^s mod n. This problem is called the Diffie-Hellman 
problem. It has been conjectured that the only way to solve the Diffie- 
Hellman problem in general is to solve the discrete logarithm problem. We 
discuss this problem next. 


Discrete logarithms are important to consider when studying the Diffie- 
Hellman key exchange because the solution to a particular discrete log- 
arithm problem leads directly to the solution to a corresponding Diffie- 
Hellman problem. Suppose we intercept transmissions between our enemies 
as they perform a Diffie-Hellman key exchange. That is, using the vari- 
ables defined previously, suppose we intercept values of k, n, k^r mod n, 
and k^s mod n. We now consider the problem of determining r from the 
knowledge of k, n, and k^r mod n. In this scenario, r is called a discrete log- 
arithm of k^r mod n to the base k, and the problem of determining r from 
the knowledge of k, n, and k^r mod n is called the discrete logarithm problem. 
Note that if we could solve this general discrete logarithm problem, then the 
preceding general Diffie-Hellman problem would also be solved, for we could 
determine r from k^r mod n, and then compute a = (k^s)^r mod n. However, 
solving the discrete logarithm problem is not an easy method for solving the 
Diffie-Hellman problem, for it can be argued that the best (fastest) way to 
solve the discrete logarithm problem with a composite modulus n involves 
first factoring n. Thus, the factorization problem that provides security to 
the RSA system also provides security to the Diffie-Hellman key exchange 
(as it has been presented in this section). 


Many algorithms for computing discrete logarithms have been pre- 
sented in literature. For small values of n, and some special large values 
of n (for example, powers of a small base), many mathematics software 
packages have pre-defined functions for directly computing discrete loga- 
rithms. The Maple function for computing discrete logarithms is the mlog 
function, which is part of the numtheory number theory package. If we 
enter the following commands in Maple 


> with(numtheory): 


> mlog(y, k, n); 
for positive integers y, k, and n, Maple will return an integer r with the 
property that y = k^r mod n. (If no such integer exists, Maple will return 
false.) For example, the following command 

> mlog(1438, 256, 8383); 


109 


indicates that 256^109 mod 8383 = 1438, a fact we used in Example 7.6. 


Using mlog and provided n is small, an intruder who intercepts 
Diffie-Hellman key exchange transmissions could easily determine the 
resulting cryptographic key. For example, suppose an intruder intercepts 
the second set of transmissions k = 256, n = 8383, k^r mod n = 5835, 
and k^s mod n = 1438 from Example 7.6. The intruder could deter- 
mine the resulting RSA encryption exponent a by using mlog as 
above to find that s = 109 satisfies k^s mod n = 1438, and then computing 
a = 5835^109 mod 8383 = 3439. 
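For readers without Maple, the intruder's attack can be imitated by a brute-force search. The Python sketch below is not how mlog works internally (we make no claim about Maple's algorithm); it simply tries exponents in increasing order, which is feasible only because n is small. The name discrete_log is our own.

```python
def discrete_log(y, k, n):
    """Smallest r >= 0 with k^r = y mod n, or None if no such r exists."""
    x = 1 % n
    for r in range(n):         # the powers of k take fewer than n distinct values
        if x == y % n:
            return r
        x = x * k % n
    return None

# Intercepted second set of transmissions from Example 7.6
k, n = 256, 8383
kr, ks = 5835, 1438

s = discrete_log(ks, k, n)     # recover an exponent s with k^s = 1438 mod n
a = pow(kr, s, n)              # then a = (k^r)^s mod n
print(s, a)
```

The search takes at most n steps, so its cost grows in proportion to n itself rather than to the number of digits of n. That is the sense in which the attack is easy only "provided n is small".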


In addition to the fact that mlog will in general run essentially for- 
ever for very large values of n (as will all known algorithms for computing 
discrete logarithms), there is another problem with using mlog to “undo” 
the operation of modular exponentiation. While it is true that entering 
the preceding general mlog command will cause Maple to return an in- 
teger r with the property that y = k^r mod n, this integer will not nec- 
essarily be the integer actually used in the modular exponentiation being 
undone. For example, in the first part of Example 7.6, we used the fact 
that 256^4882 mod 8383 = 1754. But the following command 

> mlog(1754, 256, 8383); 


782 


indicates that also 256^782 mod 8383 = 1754. Hence, the number returned 
by Maple is not the exponent we used in the example. However, this would 
not pose a problem for an intruder who intercepts the first set of trans- 
missions from Example 7.6, for the intruder could still find the resulting 
candidate encryption exponent a by computing 2908^782 mod 8383 = 6584. 
Thus, despite the fact that Maple returned an unexpected result, this re- 
sult can still be used in the intruder’s general procedure for determining 
the candidate encryption exponent. To see that this will be true in gen- 
eral, suppose an intruder uses the mlog command to try to find the ex- 
ponent in a Diffie-Hellman key exchange transmission k^s mod n. Even if 
the number returned by Maple is s' ≠ s, since it will be the case that 
k^{s'} mod n = k^s mod n, the intruder can still find the candidate encryption 
exponent a in the way illustrated above since it will also be the case that 
a = (k^r)^s mod n = (k^r)^{s'} mod n. 
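One way to see why 782 works as well as 4882 is to look at the multiplicative order of k modulo n: two exponents give the same power of k exactly when they differ by a multiple of that order. The following Python sketch checks this for the numbers in Example 7.6; the helper name order is our own.

```python
def order(k, n):
    """Smallest t > 0 with k^t = 1 mod n (assumes (k, n) = 1)."""
    t, x = 1, k % n
    while x != 1:
        x = x * k % n
        t += 1
    return t

k, n = 256, 8383
t = order(k, n)
s, s_alt = 4882, 782           # exponent actually used, and the one Maple returned

print(t, s % t, s_alt % t)                       # both exponents reduce to the same residue
print(pow(2908, s, n), pow(2908, s_alt, n))      # and yield the same candidate a
```

Since 2908 is itself a power of k, its order divides the order of k, so replacing s by any s' congruent to s modulo that order leaves (k^r)^s mod n unchanged.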


© 1999 by CRC Press LLC 


Written Exercises 


1. Consider the message, “ATTACK RIGHT FLANK”. 


(a) Encipher this message using the RSA cryptosystem with primes 
p = 11 and q = 23 and encryption exponent a = 7. Use the 
correspondence α from Chapter 6 to convert the message into 
numerical form, and encipher each plaintext character separately 
as in Example 7.2. 


(b) Use the Euclidean algorithm to find the decryption exponent 
that corresponds to the encryption exponent in part (a). 


(c) Encipher this message using the RSA cryptosystem with primes 
p = 83 and q = 131 and encryption exponent a = 3. Use the 
correspondence α from Chapter 6 to convert the message into 
numerical form, and group the plaintext integers into blocks with 
four digits as in Example 7.3 before enciphering. 


2. Suppose you wish to be able to receive messages from a colleague 
using the RSA cryptosystem. You begin the process by choosing 
primes p = 17 and q = 29 and encryption exponent a = 153. Verify 
that this value of a is a valid RSA encryption exponent, and use the 
Euclidean algorithm to find the corresponding decryption exponent. 


3. Suppose your enemy is exchanging messages using the RSA cryptosys- 
tem, and you intercept their modulus n = 33, encryption exponent 
a = 7, and the following ciphertext: 27 8 20 29 16 16 9 13 20 
13 0 8 30 16 13. Decipher this message. (The correspondence α 
from Chapter 6 was used to convert the message into numerical form, 
and each plaintext character was enciphered separately as in Example 
7.2.) 


4. Suppose you wish to be able to receive messages from a colleague using 
the RSA cryptosystem. You begin the process by choosing primes 
p = 47 and q = 59 and encryption exponent a = 1779. Suppose you 
also determine the corresponding decryption exponent b = 3, and 
you receive the following ciphertext from your colleague: 0792 2016 
0709 0464 1497 1086 2366 0524. Decipher this message. (The 
correspondence α from Chapter 6 was used to convert the message 
into numerical form, and the plaintext integers were grouped into 
blocks with four digits as in Example 7.3 before being enciphered.) 


5. Consider the modular exponentiation 

13021820^18783 = 91518013 mod 363794227 

that enciphered the message in Example 7.4. Find the exact number 
of multiplications the technique for efficient modular exponentiation 
illustrated in Section 7.4 requires to perform this calculation. 


6. Show that the technique for efficiently computing P^a mod M 
illustrated in Section 7.4 requires in general at most 2 log_2 a 
multiplications. 


7. Show that 15 is a pseudoprime to the base 4 but not a pseudoprime 
to the base 3. 


8. Use Fermat factorization to find the two prime factors of the integer 
n = pq = 321179. 


9. Suppose you wish to send the secret message P to a colleague across 
an insecure line of communication using the RSA cryptosystem. As- 
sume you have made public your personal RSA modulus n_1 and en- 
cryption exponent a_1 while keeping your decryption exponent b_1 se- 
cret, and your colleague has made public his or her personal RSA 
modulus n_2 and encryption exponent a_2 while keeping his or her de- 
cryption exponent b_2 secret. Suppose also that n_1 > n_2. 


(a) Explain how the method described in Section 7.7 for digitally 
signing your message could fail. 

(b) Devise a method similar to the one described in Section 7.7 for 
digitally signing your message that could not fail. 


10. Using primes p = 5 and q = 7, act as both people in the Diffie- 
Hellman key exchange system and agree upon a valid RSA encryption 
exponent a. List the results from all trials of the key exchange process, 
including trials that do not result in a valid encryption exponent. 


11. Show that the set U_n of units in Z_n forms a multiplicative group. 


12. Prove Theorem 7.1. 


13. Prove Corollary 7.3. 


Maple Exercises 


1. Consider the message, “GO PACK”. 


(a) Encipher this message using the RSA cryptosystem with 3-digit 
primes p and q and a valid 2-digit encryption exponent a of your 
choice. Use the correspondence α from Chapter 6 to convert the 
message into numerical form, and group the plaintext integers 
into blocks with four digits as in Example 7.3 before enciphering. 


(b) Encipher this message using the RSA cryptosystem with 4-digit 
primes p and q and a valid 3-digit encryption exponent a of your 
choice. Use the correspondence α from Chapter 6 to convert the 
message into numerical form, and group the plaintext integers 
into blocks with six digits before enciphering. 

(c) Encipher this message using the RSA cryptosystem with 7-digit 
primes p and q and a valid 4-digit encryption exponent a of your 
choice. Use the correspondence α from Chapter 6 to convert the 
message into numerical form, and group the plaintext integers 
into a single block as in Example 7.4 before enciphering. 


2. Suppose your enemy is exchanging messages using the RSA cryp- 
tosystem, and you intercept their modulus n = 86722637, encryption 
exponent a = 679, and the following ciphertext: 35747828 20827476 
55134021 85009695. Decipher this message. (The correspondence α 
from Chapter 6 was used to convert the message into numerical form, 
and the plaintext integers were grouped into blocks with six digits 
before being enciphered.) 


3. Set up a parameterization of the RSA cryptosystem using primes p 
and q with at least 30 digits each. Choose a valid encryption exponent 
a, and determine a corresponding decryption exponent b. Then use 
this parameterization of the RSA system to encipher and decipher 
the message, “CANCEL MISSION WAIT FOR NEW ORDERS”. (Use 
the correspondence α from Chapter 6 to convert the message into 
numerical form.) 


4. Using a Maple for or while loop, find the smallest base to which the 
number 3215031751 is not a pseudoprime. 


5. Using primes p = 503 and q = 751, act as both people in the Diffie- 
Hellman key exchange system and agree upon a valid RSA encryption 
exponent a. List the results from all trials of the key exchange process, 
including trials that do not result in a valid encryption exponent. 
Also, show how an intruder could use Maple to find the value of a. 


Chapter 8 


Elliptic Curve 
Cryptography 


Recall from Section 7.8 that the security of the Diffie-Hellman key exchange 
system is based on the difficulty of solving the discrete logarithm problem. 
In this chapter we discuss a public-key cryptosystem whose security is also 
based on the difficulty of solving the discrete logarithm problem. This sys- 
tem, named the ElGamal cryptosystem for T. ElGamal who first published 
the system in 1985, has formed an important area of recent cryptographic 
research due to how elliptic curves can naturally be incorporated into the 
system. 


8.1 The ElGamal Cryptosystem 


Before discussing elliptic curves and how they can naturally be incorporated 
into the ElGamal system, we first describe the system in general and give 
two simple examples of it. In order to describe the ElGamal system, suppose 
two people wish to exchange a secret message across an insecure line of 
communication. They can accomplish this by the following steps in the 
ElGamal cryptosystem: 


1. As with the RSA cryptosystem, the intended recipient of the message 
initiates the process. The intended recipient chooses a finite abelian 
group G and an element a ∈ G, then chooses a positive integer n, 
computes b = a^n in G, and makes the group G and the values of a 
and b public knowledge. 


2. Using some public method of conversion, the originator of the mes- 
sage converts his or her message into an equivalent element or list of 
elements in G. Suppose the message converts to the element w ∈ G. 
The originator of the message then chooses a positive integer k, com- 
putes y = a^k and z = w b^k in G, and sends the values of y and z to 
the intended recipient across the insecure line of communication. 


3. Because the intended recipient of the message knows n, the intended 
recipient can recover w by computing z y^{-n} in G since 


z y^{-n} = w b^k (a^k)^{-n} = w (b a^{-n})^k = w(1)^k = w. 


Note: If |a| = m, then y^{-n} can be determined as y^{m-n}. 


Although the preceding steps are specific to the ElGamal cryptosystem, 
the system can appear in many different forms due to the various types of 
groups that can be used for G. This is precisely how we will incorporate 
elliptic curves into the system. We will show in Section 8.3 that elliptic 
curves over finite fields form abelian groups with a specially-defined opera- 
tion. Of course, it is not necessary to use this type of group in the system. 
The ElGamal system is especially easy to implement if G is chosen to be a 
group like the multiplicative group Z_p^* for prime p. 


Example 8.1 Suppose we wish to use the ElGamal cryptosystem to send 
the message, “NCSU” to a colleague across an insecure line of communi- 
cation. 


1. Our colleague begins the process by choosing G = Z_p^* for prime 
p = 100000007. Next, our colleague chooses a = 180989 and 
n = 5124541, computes b = a^n mod p = 10524524, and sends the 
values of p, a, and b to us. 


2. Suppose we use the correspondence α from Chapter 6 to convert our 
message into a single block numerical equivalent. That is, suppose 
we convert our message into the numerical equivalent w = 13021820. 
We then choose k = 3638997, compute y = a^k mod p = 73133845 and 
z = w b^k mod p = 83973114, and send the values of y and z to our 
colleague. 


3. Our colleague can easily verify that the polynomial x - 180989 
is primitive in Z_p[x]. Hence, the order of a = 180989 in Z_p^* 
is p - 1. Thus, our colleague can recover w by computing 
z y^{-n} mod p = z y^{p-1-n} mod p = 13021820. 
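The three steps of Example 8.1 translate almost line for line into Python. The sketch below uses the values of p, a, n, w, and k from the example; the function names encipher and decipher are our own. It relies on the fact that y^(p-1) = 1 mod p for any nonzero y when p is prime, so that y^{-n} can be computed as y^{p-1-n}.

```python
p = 100000007                  # prime modulus from Example 8.1
a, n = 180989, 5124541         # public element a and the recipient's secret n
b = pow(a, n, p)               # public value b = a^n mod p

def encipher(w, k):
    """Originator: return (y, z) = (a^k mod p, w * b^k mod p)."""
    return pow(a, k, p), w * pow(b, k, p) % p

def decipher(y, z):
    """Recipient: recover w = z * y^(-n) = z * y^(p-1-n) mod p."""
    return z * pow(y, p - 1 - n, p) % p

w, k = 13021820, 3638997       # "NCSU" as a single block, and the originator's k
y, z = encipher(w, k)
print(b, y, z, decipher(y, z))
```

The printed values of b, y, and z can be compared against those reported in the example; the last value printed is the recovered plaintext w.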


Note that in Example 8.1 we would have to assume that the values of 
p, a, b, y, and z were all public knowledge since they were all transmitted 
across an insecure line of communication. And for the system to be secure, 
an intruder must not be able to determine z y^{-n} mod p. Hence, an intruder 
must not be able to determine the value of n from intercepted values of p, 
a, and b = a^n mod p. But this is precisely the statement of the discrete 
logarithm problem we discussed in Section 7.8 with a prime modulus. That 
is, the security of the ElGamal system in Example 8.1 is based on an in- 
truder not being able to solve the discrete logarithm problem we discussed 
in Section 7.8 with a prime modulus. We mentioned in Section 7.8 that the 
discrete logarithm problem with a large composite modulus is in general 
very difficult to solve. This is true, of course, with a large prime modulus 
as well. 


Discrete logarithms and the discrete logarithm problem can be defined 
much more generally than how we defined them in Section 7.8. More gen- 
erally, for any element x in a finite group G and an element y ∈ G that is a 
power of x, any integer r that satisfies x^r = y is called a “discrete logarithm 
of y to the base x”, and the problem of determining an integer r that satis- 
fies x^r = y is called the “discrete logarithm problem”. As we mentioned in 
Section 7.8, many algorithms for computing discrete logarithms have been 
presented in literature. However, in groups with extremely large order, even 
the fastest known discrete logarithm algorithms are in general extremely 
time-consuming. For example, the fastest known discrete logarithm al- 
gorithms would take millions of years to compute discrete logarithms in 
groups with approximately 102° elements. 


As we mentioned above, the ElGamal cryptosystem can appear in 
many different forms due to the various types of groups that can be used for 
G. Recall that the group Z_p^* used in Example 8.1 is the group of nonzero 
elements in the finite field Z_p. We close this section with an example of 
the ElGamal system using the group of nonzero elements in a more general 
finite field. 


Example 8.2 Suppose we wish to use the ElGamal cryptosystem to send 
a secret message to a colleague across an insecure line of communication. 


1. Our colleague begins the process by choosing the primitive polynomial 
p(x) = x^5 + 4x + 2 ∈ Z_5[x]. Then for the finite field F = Z_5[x]/(p(x)) 
with 5^5 = 3125 elements, our colleague lets G be the multiplicative 
group F^*. Next, our colleague chooses a = x and n = 1005, computes 
b = a^n = 2x^4 + 4x^3 + x^2 + 4x + 2 in G, and sends p(x), a, and b to us. 


2. Using some public method of conversion, we convert our message into 
the field element w = x^4 + x^3 + 3 ∈ G. We then choose k = 537, 
compute y = a^k = 2x^4 + x^3 + 4x + 4 and z = w b^k = x^4 + 3x^3 + 2x^2 + 3 
in G, and send y and z to our colleague. 


3. Since p(x) is primitive and a = x, |a| = |F^*| = 3124. Hence, our 
colleague can recover w by computing z y^{-n} = z y^{3124-n} = x^4 + x^3 + 3. 


Note that an intruder could break the ElGamal cryptosystem in Exam- 
ple 8.2 by solving the discrete logarithm problem in the group F*. Specifi- 
cally, an intruder could break the system by finding a discrete logarithm of 
b = 2x^4 + 4x^3 + x^2 + 4x + 2 ∈ F^* to the base a = x. 
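The field arithmetic in Example 8.2 can also be carried out without Maple. In the Python sketch below an element of F is stored as a list of five coefficients in Z_5, lowest degree first, and products are reduced using x^5 = -4x - 2 = x + 3 in Z_5. The representation and the names mul and power are our own illustrative choices.

```python
Q = 5  # coefficients live in Z_5

def mul(f, g):
    """Multiply two elements of F = Z_5[x]/(x^5 + 4x + 2)."""
    prod = [0] * 9
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] = (prod[i + j] + fi * gj) % Q
    # In F, x^5 = x + 3, so x^d = x^(d-4) + 3x^(d-5) for d = 8, 7, 6, 5.
    for d in range(8, 4, -1):
        c, prod[d] = prod[d], 0
        prod[d - 4] = (prod[d - 4] + c) % Q
        prod[d - 5] = (prod[d - 5] + 3 * c) % Q
    return prod[:5]

def power(f, e):
    """Square-and-multiply exponentiation in F."""
    result = [1, 0, 0, 0, 0]
    while e > 0:
        if e & 1:
            result = mul(result, f)
        f = mul(f, f)
        e >>= 1
    return result

a, n = [0, 1, 0, 0, 0], 1005             # a = x and the secret exponent n
b = power(a, n)                           # public b = a^n
w, k = [3, 0, 0, 1, 1], 537               # w = x^4 + x^3 + 3
y, z = power(a, k), mul(w, power(b, k))   # transmitted pair (y, z)
recovered = mul(z, power(y, 3124 - n))    # z * y^(-n), using |F^*| = 3124
print(b, y, z, recovered)
```

Reading each printed list from right to left gives the polynomial in the usual order; for instance [2, 4, 1, 4, 2] is 2x^4 + 4x^3 + x^2 + 4x + 2.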


8.2 The ElGamal Cryptosystem with Maple 


In this section we show how Maple can be used to perform the computations 
in Examples 8.1 and 8.2. 


In Example 8.1 our colleague began the process by choosing G = Z_p^* 
with prime p = 100000007. The following command defines this value of p 
and shows that it is prime. 

> p := nextprime(100000000); 


p := 100000007 


Next, our colleague chose the following values for a and n. 
> a := 180989: 
> n := 5124541: 


Our colleague then formed b = a^n mod p. Recall that this computation can 
be done in an efficient way by using the Maple &^ command as follows. 
> b := a &^ n mod p; 


b := 10524524 


Our colleague then sent the values of p, a, and b to us. We converted our 
message into the following numerical equivalent w and chose the following 
value for k. 

> w := 13021820: 

> k := 3638997: 


Next, we computed y = a^k mod p and z = w b^k mod p. 
> y := a &^ k mod p; 


y := 73133845 
> z := w*(b &^ k) mod p; 
z := 83973114 
We then sent the values of y and z to our colleague. Recall that the poly- 
nomial x - 180989 is primitive in Z_p[x], and thus the order of a = 180989 
in Z_p^* is p - 1. The following command verifies this. 


> Primitive(x-a) mod p; 


true 


Finally, our colleague recovered w by computing z y^{-n} mod p. 


> z*(y &^ (p-1-n)) mod p; 
13021820 


In Example 8.2 our colleague began the process by choosing the primi- 
tive polynomial p(x) = x^5 + 4x + 2 ∈ Z_5[x]. The following commands define 
p(x) and show that it is primitive in Z_5[x]. 

> p := x -> x^5 + 4*x + 2: 


> Primitive(p(x)) mod 5; 


true 


For the finite field F = Z_5[x]/(p(x)) of order 5^5 = 3125, our colleague then 
let G = F^*. Next, our colleague chose a = x and the following value for n. 

> a := x: 
> n := 1005: 


Our colleague then formed b = a^n in G. Recall that this computation can 
be done by using the Maple Powmod command as follows. 


> b := Powmod(a, n, p(x), x) mod 5; 


b := 2x^4 + 4x^3 + x^2 + 4x + 2 


Our colleague then sent p(x), a, and b to us. We converted our message 
into the following field element w and chose the following value for k. 


> w := x^4 + x^3 + 3: 


> k := 537: 


Next, we computed y = a^k in G. 


> y := Powmod(a, k, p(x), x) mod 5; 


y := 2x^4 + x^3 + 4x + 4 


We then computed z = w b^k in G. To do this, we can enter the following 
Powmod and Rem commands. 

> bk := Powmod(b, k, p(x), x) mod 5; 


bk := 3x^4 + 2x^2 + 3x + 1 

> z := Rem(w*bk, p(x), x) mod 5; 


z := x^4 + 3x^3 + 2x^2 + 3 


We then sent y and z to our colleague. Recall that since p(x) is primitive 
and a = x, |a| = 3124. Hence, our colleague recovered w by computing 
z y^{3124-n} ∈ G. We can perform this computation by entering the following 
Powmod and Rem commands. 

> yn := Powmod(y, 3124-n, p(x), x) mod 5; 


yn := x^4 + 4x^3 + 4x^2 + 4x + 2 

> Rem(z*yn, p(x), x) mod 5; 


x^4 + x^3 + 3 


8.3 Elliptic Curves 


Elliptic curves have figured prominently in several types of mathematical 
problems. For example, the recent proof of Fermat’s Last Theorem by An- 
drew Wiles employed elliptic curves. Elliptic curves have also played an im- 
portant role in integer factorization, primality testing, and, more recently, 
public-key cryptography. The idea of using elliptic curves in public-key 
cryptography was first proposed by N. Koblitz and V. Miller in 1985. 


Let F be a field not of characteristic 2 or 3, and suppose c,d ∈ F 
such that x^3 + cx + d has no multiple roots or, equivalently, such that 
4c^3 + 27d^2 ≠ 0. Then the set of ordered pairs (x,y) ∈ F × F of solutions 
to the equation 

y^2 = x^3 + cx + d (8.1) 


together with a special element denoted by O and called the point at in- 
finity is called an elliptic curve. The significance of the element O will be 
described below. An elliptic curve, when endowed with a specially-defined 
operation, forms an abelian group. This operation is initially best viewed 
geometrically when applied to an elliptic curve over the reals. For example, 
consider the following graph of the ordered pairs (x,y) of solutions to the 
equation y^2 = x^3 - 6x over the reals. 


© 1999 by CRC Press LLC 


Note first that, as we would expect from the form of (8.1), this graph 
is symmetric about the x-axis. We now describe the operation that, when 
applied to the points on this graph and point at infinity O, gives this elliptic 
curve E the structure of an abelian group. This operation is an addition 
operation and can be summarized as follows. 


1. The point at infinity serves as the identity in the group. Thus, by 
definition, P + O = O + P = P for all P ∈ E. 


2. For any point P = (x, y) on the graph of y^2 = x^3 - 6x, we define the 
negative of P to be -P = (x, -y). This is illustrated in the following 
graph. 


3. Suppose that P and Q are on the graph of y^2 = x^3 - 6x with P ≠ ±Q, 
and that the line connecting P and Q is not tangent to the graph at 
P or Q. Then it is not difficult to show that the line connecting P 
and Q will intersect the graph at a unique third point R. We then 
define P + Q = -R. This is illustrated in the following graph. 


4. Suppose that P and Q are on the graph of y^2 = x^3 - 6x with P ≠ ±Q, 
and that the line connecting P and Q is tangent to the graph at P. 
We then define P + Q = -P. This is illustrated in the following 
graph. 


5. Suppose that P = (x, y) is on the graph of y^2 = x^3 - 6x with y ≠ 0 
(so that the tangent line at P is not vertical), and that P is not a point 
of inflection for the graph. Then it is not difficult to show that the line 
tangent to the graph at P will intersect the graph at a unique second 
point R. We then define P + P = -R. This is illustrated in the following 
graph. 


6. Suppose that P is on the graph of y^2 = x^3 - 6x, and that P is a 
point of inflection for the graph. We then define P + P = -P. This 
is illustrated in the following graph. 


This operation is clearly commutative. The fact that this operation is 
associative is less obvious, and will be assumed. 


Recall that for the ElGamal cryptosystem we need a finite abelian 
group. Hence, elliptic curves over the reals like the one illustrated above 
do not form groups that we could use in the ElGamal system. However, for 
an elliptic curve in which the underlying field F is finite, the elliptic curve 
will also be finite. For example, consider an elliptic curve in which the 
underlying field is Z_p for prime p > 3. Although the operation described 
above geometrically is applied specifically to an elliptic curve over the reals, 
this general operation gives any elliptic curve the structure of an abelian 
group. Of course, for an elliptic curve over Z_p, this operation cannot be 
described in the same way geometrically. However, the operation can be 
expressed algebraically. 


Let p be a prime with p > 3, and suppose c, d ∈ Z_p such that x^3 + cx + d 
has no multiple roots or, equivalently, such that 4c^3 + 27d^2 ≠ 0 mod p. Let 
E be the elliptic curve of ordered pairs (x, y) ∈ Z_p × Z_p of solutions to 
(8.1) modulo p and point at infinity O. It can be shown that the addition 
operation described above that gives E the structure of an abelian group 
can be expressed algebraically as follows. Recall first that O serves as the 
identity in the group. Now, let P = (x_1, y_1) and Q = (x_2, y_2) be elements 
in E. If P = -Q, then P + Q = O. Otherwise, if P = Q, then we define 
P + Q = (x_3, y_3) where 

x_3 = ((3x_1^2 + c)/(2y_1))^2 - 2x_1 mod p, (8.2) 
y_3 = ((3x_1^2 + c)/(2y_1))(x_1 - x_3) - y_1 mod p. (8.3) 

And if P ≠ ±Q, then we define P + Q = (x_3, y_3) where 

x_3 = ((y_2 - y_1)/(x_2 - x_1))^2 - x_1 - x_2 mod p, (8.4) 
y_3 = ((y_2 - y_1)/(x_2 - x_1))(x_1 - x_3) - y_1 mod p. (8.5) 
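The case analysis in (8.2) through (8.5) translates directly into a short routine. The following Python sketch is an illustration of ours and is not the Maple procedure addec used later in this chapter; the name ec_add and the use of None to represent the point at infinity O are assumptions made for the sketch. It requires Python 3.8 or later, where pow(v, -1, p) returns the inverse of v modulo p, and it assumes that the coordinates of each point are already reduced modulo p.

```python
def ec_add(P, Q, c, p):
    """Return P + Q on y^2 = x^3 + c*x + d over Z_p (None stands for O)."""
    if P is None:                          # O + Q = Q
        return Q
    if Q is None:                          # P + O = P
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:    # P = -Q, so P + Q = O
        return None
    if P == Q:
        # Tangent slope (3*x1^2 + c)/(2*y1), as in (8.2) and (8.3).
        slope = (3 * x1 * x1 + c) * pow((2 * y1) % p, -1, p) % p
    else:
        # Chord slope (y2 - y1)/(x2 - x1), as in (8.4) and (8.5).
        slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p     # equals slope^2 - 2*x1 when P = Q
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)


# Two sums on y^2 = x^3 + x + 6 over Z_19.
print(ec_add((2, 4), (10, 16), 1, 19))
print(ec_add((0, 5), (0, 5), 1, 19))
```

Note that d does not appear in any of the formulas, so it is not a parameter of the sketch; d only determines which ordered pairs lie on the curve.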


For small primes p, we can construct elliptic curves over Z_p by trial 
and error. Let p be a prime with p > 3, and suppose c, d ∈ Z_p such that 
4c^3 + 27d^2 ≠ 0 mod p. We can then use the following steps to construct the 
solutions to (8.1) modulo p. 


1. Determine which x ∈ Z_p have the property that z = x^3 + cx + d mod p 
is a perfect square in Z_p. 


2. Find all y ∈ Z_p such that y^2 = z mod p. 


The values in Z_p^* that are perfect squares are called quadratic residues. 
Thus, the values of z determined in the preceding two steps are 0 and the 
quadratic residues in Z_p^*. 


For the first preceding step, we consider the homomorphism s(y) = y^2 
on Z_p^*. Note that the kernel of s(y) is K = {x | x^2 = 1} = {1, -1}. 
Hence, |K| = 2, and the set Q = {z ∈ Z_p^* | z = s(y) for some y ∈ Z_p^*} of 
quadratic residues in Z_p^* has order t = (p-1)/2. Next, we consider the function 
g(x) = x^t - 1. If z ∈ Q, then z = y^2 mod p for some y ∈ Z_p^*. Thus, 
g(z) = z^t - 1 = y^(2t) - 1 = y^(p-1) - 1 = 0 mod p by Lagrange’s Theorem. 
Since g(x) has degree t, it has at most t roots in Z_p. Hence, the t roots 
of g(x) are precisely the t elements in Q. We summarize this test in the 
following lemma. 


Lemma 8.1 An element z is a quadratic residue in Z_p^* if and only if 
z^((p-1)/2) = 1 mod p. Hence, z is a perfect square in Z_p if and only if z = 0 or 
z^((p-1)/2) = 1 mod p. 
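Lemma 8.1 can be checked numerically for a small prime. The following Python sketch is an illustration of ours; the function name is_quadratic_residue is hypothetical and is not part of the Maple code in this book. It uses Python's built-in three-argument pow for modular exponentiation and compares the result with a direct list of squares.

```python
def is_quadratic_residue(z, p):
    """Test of Lemma 8.1: z in Z_p^* is a square iff z^((p-1)/2) = 1 mod p."""
    return pow(z % p, (p - 1) // 2, p) == 1


p = 19
by_lemma = [z for z in range(1, p) if is_quadratic_residue(z, p)]
by_squaring = sorted({(y * y) % p for y in range(1, p)})
print(by_lemma)
print(by_lemma == by_squaring)
```

Both lists contain t = (p - 1)/2 = 9 elements, in agreement with the count of quadratic residues derived above.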


For the second preceding step, note that if z = y^2 mod p, it follows 
that (z^((p+1)/4))^2 = y^(p+1) = y^2 = z mod p. Therefore, for the second preceding 
step, if p = 3 mod 4, then we can find a square root of z by computing 
z^((p+1)/4) mod p. We summarize this in the following lemma. 


Lemma 8.2 Suppose p = 3 mod 4. If z is a quadratic residue in Z_p^*, then 
y = z^((p+1)/4) mod p is a square root of z in Z_p^*. The only other square root of 
z in Z_p^* is -y. 
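The square root formula in Lemma 8.2 is a single modular exponentiation. The Python sketch below is an illustration of ours (the name sqrt_mod_p is hypothetical); it refuses primes that are not congruent to 3 modulo 4 and checks its own answer, since the formula returns a meaningless value when z is not a quadratic residue.

```python
def sqrt_mod_p(z, p):
    """Return both square roots of z modulo a prime p with p = 3 mod 4."""
    if p % 4 != 3:
        raise ValueError("Lemma 8.2 requires p = 3 mod 4")
    y = pow(z % p, (p + 1) // 4, p)
    if (y * y) % p != z % p:
        raise ValueError("z is not a quadratic residue modulo p")
    return y, (p - y) % p


print(sqrt_mod_p(6, 19))
```

The exponent (p + 1)/4 is an integer exactly when p = 3 mod 4, which is why the lemma carries that hypothesis.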
In summary, let p be a prime with p > 3 and p = 3 mod 4, and suppose 
c, d ∈ Z_p such that 4c^3 + 27d^2 ≠ 0 mod p. Let E be the elliptic curve of 
ordered pairs (x, y) in Z_p × Z_p of solutions to (8.1) modulo p and point at 
infinity O. Then for the set Q of quadratic residues in Z_p^*, 

E = {(x, ±y) | z = x^3 + cx + d ∈ Q and y = z^((p+1)/4) mod p} 
∪ {(x, 0) | x^3 + cx + d = 0} ∪ {O}. 
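The description of E above can be turned into a brute-force enumeration for small p. The sketch below is a Python illustration under the same assumption p = 3 mod 4; the name curve_points is ours, and this is not the code of the epoints procedure used in the Maple sections, which is listed in Appendix C.3. The point at infinity O is not stored in the list and must be counted separately.

```python
def curve_points(c, d, p):
    """List the pairs (x, y) in Z_p x Z_p with y^2 = x^3 + c*x + d mod p."""
    if p % 4 != 3:
        raise ValueError("this sketch assumes p = 3 mod 4 (Lemma 8.2)")
    points = []
    for x in range(p):
        z = (x**3 + c * x + d) % p
        if z == 0:
            points.append((x, 0))                # the single root y = 0
        elif pow(z, (p - 1) // 2, p) == 1:       # Lemma 8.1: z is a residue
            y = pow(z, (p + 1) // 4, p)          # Lemma 8.2: one square root
            points.append((x, y))
            points.append((x, p - y))            # the other root is -y
    return points


E = curve_points(1, 6, 19)
print(E)
print(len(E) + 1)    # add 1 for the point at infinity O
```

For large p this loop over every x in Z_p is not practical, which is why only a few points are generated in the cryptographic examples.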


Example 8.3 Let p = 19, and let E be the elliptic curve of ordered pairs 
(x, y) ∈ Z_p × Z_p of solutions to y^2 = x^3 + x + 6 modulo p and point at infinity 
O. We can construct the ordered pairs in E as follows. First, by trial and 
error, we determine the values of x in Z_p for which z = x^3 + x + 6 mod p 
is a quadratic residue in Z_p^*. For example, for x = 0, the value of z is 
z = 0^3 + 0 + 6 mod p = 6. Then since z^((p-1)/2) = 6^9 = 1 mod p, Lemma 8.1 
implies that z = 6 is a quadratic residue in Z_p^*. And for x = 1, the value 
of z is z = 1^3 + 1 + 6 mod p = 8. And since z^((p-1)/2) = 8^9 ≠ 1 mod p, Lemma 
8.1 implies that z = 8 is not a quadratic residue in Z_p^*. By continuing 
this process, we can determine that the values of x ∈ Z_p for 
which z = x^3 + x + 6 mod p is a quadratic residue in Z_p^* are x = 0, 2, 3, 
4, 10, 12, 14, and 18. Next, for each quadratic residue z in Z_p^* we must 
find the values of y in Z_p^* for which y^2 = z mod p. Since p = 3 mod 4, 
we can use Lemma 8.2 to do this. For example, for the quadratic residue 
z = 6 that results from x = 0, Lemma 8.2 implies that the square roots of 
z are z^((p+1)/4) mod p = 6^5 mod p = 5 and -5. And since z = 6 results from 
x = 0, then the ordered pairs (0, ±5) are in E. By repeating this process 
for each of the quadratic residues in Z_p^*, we can determine that the ordered 
pairs (0, ±5), (2, ±4), (3, ±6), (4, ±6), (10, ±16), (12, ±6), (14, ±16), and 
(18, ±17) are all in E. Also, by trial and error, we can determine that the 
only value of x ∈ Z_p for which z = x^3 + x + 6 = 0 mod p is x = 6. Hence, 
the only additional ordered pair in E is (6, 0). 


Now, suppose that we wish to compute the sum of the elements 
P = (x_1, y_1) = (2, 4) and Q = (x_2, y_2) = (10, 16) in E. Denote this sum 
by P + Q = (x_3, y_3). Since P ≠ ±Q, then we can use (8.4) and (8.5) 
to find x_3 and y_3. We first compute (y_2 - y_1)/(x_2 - x_1) mod p as follows. 
(Note: 8^(-1) = 12 mod p since (8)(12) = 96 = 1 mod p. This inverse can be found 
by using the Euclidean algorithm as illustrated in Section 7.1.) 

(y_2 - y_1)/(x_2 - x_1) = (16 - 4)/(10 - 2) = (12)(8)^(-1) = (12)(12) mod p = 11 mod p. 

Then, using (8.4) and (8.5), we have 

x_3 = 11^2 - 2 - 10 = 14 mod p, 
y_3 = 11(2 - 14) - 4 = 16 mod p. 

Hence, in this elliptic curve, (2, 4) + (10, 16) = (14, 16). 


Next, for the element P = (x_1, y_1) = (0, 5) in E, suppose that we wish 
to compute the sum P + P = (x_3, y_3). To do this, we can use (8.2) and 
(8.3). We first compute (3x_1^2 + c)/(2y_1) mod p as follows. (Note: 10^(-1) = 2 mod p 
since (10)(2) = 20 = 1 mod p.) 

(3x_1^2 + c)/(2y_1) = (3(0)^2 + 1)/(2(5)) = (1)(10)^(-1) = (1)(2) mod p = 2 mod p. 

Then, using (8.2) and (8.3), we have 

x_3 = 2^2 - (2)(0) = 4 mod p, 
y_3 = 2(0 - 4) - 5 = 6 mod p. 

Hence, in this elliptic curve, (0, 5) + (0, 5) = (4, 6). 


Although elliptic curves over Zp are not particularly easy to construct 
or even describe, their general structure is remarkably simple and specific. 
We summarize this general structure in the following theorem, which we 
state without proof. 


Theorem 8.3 Let E be an elliptic curve over Z_p for prime p > 3. Then 
E is isomorphic to the direct product Z_{n_1} × Z_{n_2} of the additive groups Z_{n_1} 
and Z_{n_2} for some integers n_1 and n_2 with n_2 | n_1 and n_2 | (p - 1). 


As an example of Theorem 8.3, consider the elliptic curve E in Example 
8.3. For this elliptic curve, |E| = 18, and thus the only possible values for 
n_1 and n_2 in Theorem 8.3 are n_1 = 18 and n_2 = 1, and n_1 = 6 and n_2 = 3. 
Since it can be verified that (0, 5) ∈ E generates all of the elements in E 
(as do several other elements in E), then E is cyclic. Hence, the correct 
values of n_1 and n_2 are n_1 = 18 and n_2 = 1, and E is isomorphic to the 
additive cyclic group Z_18. 


Consider an elliptic curve E over Z_p for very large prime p. While 
constructing all of the elements in E is generally not possible, it is possible, 
although nontrivial, to compute the exact value of |E| using a well-known 
algorithm by Schoof. Although Schoof’s algorithm is beyond the scope of 
this book, we will mention a well-known result from Hasse that can be 
stated rather simply and yields upper and lower bounds on |E|. This result 
is commonly called Hasse’s Theorem, which we state as follows without 
proof. 


Theorem 8.4 Let E be an elliptic curve over Z_p. Then 
p + 1 - 2√p ≤ |E| ≤ p + 1 + 2√p. 
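Because |E| is an integer, the bounds in Theorem 8.4 can be computed exactly without floating-point square roots. The following Python sketch is an illustration of ours (the name hasse_bounds is hypothetical); it uses the fact that an integer N satisfies |N - (p + 1)| ≤ 2√p exactly when |N - (p + 1)| is at most the integer part of √(4p).

```python
import math


def hasse_bounds(p):
    """Smallest and largest integer values of |E| allowed by Theorem 8.4."""
    spread = math.isqrt(4 * p)       # integer part of 2*sqrt(p)
    return p + 1 - spread, p + 1 + spread


low, high = hasse_bounds(19)
print(low, high)
print(low <= 18 <= high)             # |E| = 18 for the curve in Example 8.3
```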


We close this section by mentioning one additional fact regarding el- 
liptic curves. Recall that we began this discussion of elliptic curves by 
assuming that the underlying field F was not of characteristic 2 or 3, and 
that the cubic polynomial on the right-hand side of (8.1) had no multiple 
roots. Elliptic curves can also be defined over fields of characteristic 2 or 
3; they are just not defined as the set of solutions to an equation of the 
exact form of (8.1). Specifically, if F is a field of characteristic 2, then an 
elliptic curve over F is defined as the set of ordered pairs (x, y) ∈ F × F of 
solutions to an equation of the form 

y^2 + y = x^3 + cx + d (8.6) 


and point at infinity O where c,d € F and the cubic polynomial on the 
right-hand side of (8.6) is allowed to have multiple roots. And if F is a 
field of characteristic 3, then an elliptic curve over F is defined as the set 
of ordered pairs (x, y) ∈ F × F of solutions to an equation of the form 

y^2 = x^3 + bx^2 + cx + d (8.7) 


and point at infinity O where b, c, d ∈ F and the cubic polynomial on 
the right-hand side of (8.7) is not allowed to have multiple roots. Results 
analogous to those mentioned in this section also hold for elliptic curves 
over fields of characteristic 2 or 3. 


8.4 Elliptic Curves with Maple 


In this section we show how Maple can be used to construct the elliptic 
curve E in Example 8.3 and perform the elliptic curve addition operation. 


We begin by defining the prime p = 19 and the values c = 1 and d = 6 
for the elliptic curve equation (8.1). 


> p := 19: 
> c := 1: 
> d := 6: 


Recall that for the ordered pairs (x, y) ∈ Z_p × Z_p of solutions to (8.1) 
modulo a prime p > 3 and point at infinity O to form an elliptic curve, c 
and d must satisfy 4c^3 + 27d^2 ≠ 0 mod p. We verify this as follows. 

> 4*c^3 + 27*d^2 mod p; 


7 


Next, we store the right-hand side of (8.1) as eqn. 


> eqn := x^3 + c*x + d: 


We now generate the elements in E that are ordered pairs (x, y) ∈ Z_p × Z_p 
of solutions to (8.1) modulo p. To generate these solutions we have provided 
the user-written procedure epoints, for which code is given in Appendix 
C.3. If this procedure is saved as the text file epoints in the directory from 
which we are running Maple, then we can include the epoints procedure 
in this Maple session by entering the following command. 


> read epoints; 


We can then generate the ordered pairs of solutions to (8.1) modulo p by 
entering the following command. 


> ecurve := epoints(eqn, x, infinity, p); 


ecurve := [0,5], [0,14], [2,4], [2,15], [3,6], [3,13], [4,6], [4,13], 
[6,0], [10,16], [10,3], [12,6], [12,13], [14,16], [14,3], [18,17], 
[18,2] 


In the preceding command, the first parameter is the right-hand side of 
(8.1), and the second parameter is the variable used in the first parameter. 
The third parameter is a numerical value that indicates the number of 
solutions to (8.1) we wish the command to generate. If this parameter 
exceeds the total number of solutions to (8.1), then the command will 
generate all of the solutions to (8.1). By using infinity for this parameter, 
we guarantee that the command will generate all of the solutions to (8.1). 
The last parameter is the prime p. 


Recall that the ordered pairs of solutions to (8.1) modulo p form all of 
the elements in E except the point at infinity O. By entering the following 
command, we attach the representation 0 for the point at infinity to the 
list ecurve of elements in E. 

> ecurve := ecurve, 0; 

ecurve := [0,5], [0,14], [2,4], [2,15], [3,6], [3,13], [4,6], [4,13], 
[6,0], [10,16], [10,3], [12,6], [12,13], [14,16], [14,3], [18,17], 
[18,2], 0 


The following nops command returns the number of elements in E. 

> nops([ecurve]); 


18 


To perform the elliptic curve addition operation defined in Section 8.3, 
we have provided the user-written procedure addec, for which code is given 
in Appendix C.3. Assuming this procedure is saved as the text file addec 
in the directory from which we are running Maple, then we can include the 
addec procedure in this Maple session as follows. 


> read addec; 


We can then compute the sum of the elements [2,4] and [10,16] in E by 
entering the following command. 
> addec([2,4], [10,16], c, p); 


[14,16] 
In the preceding command, the first two parameters are the elements in E 
we wish to add. The third and fourth parameters are the value of c from 
(8.1) and the prime p. 


As another example of the addec procedure, in the next command we 
add the element [0,5] in E to itself. 
> addec([0,5], [0,5], c, p); 


[4,6] 


Next, we compute the sum of the element [0,5] and the point at infinity. 
> addec([0,5], 0, c, p); 


[0,5] 

And finally, we compute the sum of the elements [0,5] and [0,14] in E. 
> addec([0,5], [0,14], c, p); 
0 
Note that the preceding output shows, as expected, that [0,5] and 
[0,14] = [0,-5] are inverses of each other in E. 


We can verify that [0,5] is a cyclic generator for E as follows. We first 
assign the element [0,5] as the variable gen. 
> gen := [0,5]: 


We now construct the cyclic subgroup of E generated by gen. To do this, 
we first assign [0,5] also as the variable temp and store this element as the 
first entry in a table cgroup. 
> temp := [0,5]: 
> pct := 1: 
> cgroup[pct] := temp: 


Then by entering the following while loop, we construct the cyclic sub- 
group of E generated by gen and place these elements as the subsequent 
entries in cgroup. More specifically, by entering the following while loop, 
we compute multiples of gen using addec and place these multiples as 
the subsequent entries in cgroup. The loop terminates when the identity 
element 0 is obtained. 

> while temp <> 0 do 
>    temp := addec(temp, gen, c, p): 
>    pct := pct + 1: 
>    cgroup[pct] := temp: 
> od: 


> seq(cgroup[i], i = 1..pct); 


[0,5], [4,6], [2,4], [3,6], [14,3], [12, 13], [18,2], [10,3], [6,0], 
[10,16], [18,17], [12,6], [14,16], [3,13], [2,15], [4,13], 
[0,14],0 


Since the preceding output is all of the elements in E, then [0,5] is a cyclic 
generator for E. 


8.5 Elliptic Curve Cryptography 


If an elliptic curve over Z_p for some prime p is used as the group G in the 
ElGamal cryptosystem, the value of p would have to be extremely large 
in order for the system to be secure. More specifically, it is commonly 
accepted that G should contain a cyclic subgroup of order at least 2^160 in 
order for the system to be secure. Constructing all of the elements in an 
elliptic curve over Z_p for extremely large p can be very time-consuming. 
However, to use an elliptic curve E over Z_p as the group G in the ElGamal 
system, it is not necessary to construct all of the elements in E. It is only 
necessary to find an element in E that has a relatively large order. Suppose 
we wish to use the ElGamal cryptosystem with an elliptic curve over Z_p as 
the group G in the system to send a secret message to a colleague across an 
insecure line of communication. Then the system could proceed as follows. 


1. Our colleague begins the process by choosing a very large prime p 
and values for c and d that satisfy 4c^3 + 27d^2 ≠ 0 mod p. Let E be 
the elliptic curve of ordered pairs (x, y) ∈ Z_p × Z_p of solutions to 
(8.1) modulo p and point at infinity O. Our colleague then chooses 
an element a ∈ E with relatively large order (which our colleague 
could verify by computing multiples of a in E). Next, our colleague 
chooses a positive integer n, computes b = na = a + a + ··· + a in E 
(note that we use the notation na for b instead of a^n since the elliptic 
curve operation is an addition operation), and sends the values of p, 
c, and d and the elements a, b ∈ E to us. 


2. Using some public method of conversion, we convert our message into 
an equivalent element w ∈ E. We then choose a positive integer k, 
compute y = ka and z = w + kb in E, and send the elements y, z ∈ E 
to our colleague. 


3. Our colleague can then recover w by computing z - ny in E since 

z - ny = w + kb - nka = w + kb - kb = w. 


Example 8.4 Suppose we wish to use the ElGamal cryptosystem with an 
elliptic curve over Z_p as the group G in the system to send a secret message 
to a colleague across an insecure line of communication. 


1. Our colleague begins the process by choosing p = 19 (for illustration, 
we use a very small value for p in this example), c = 1, and d = 6. 
Then the elliptic curve E of ordered pairs (x, y) ∈ Z_p × Z_p of solutions 
to (8.1) modulo p and point at infinity O is the elliptic curve in 
Example 8.3. Our colleague then chooses a = (0, 5) ∈ E, which, 
recall from Example 8.3, generates all of E. Next, our colleague 
chooses n = 4, computes b = na = 4(0, 5) = (3, 6) ∈ E using the 
elliptic curve addition operation, and sends the values of p, c, and d 
and the elements a, b ∈ E to us. 


2. Using some public method of conversion, we convert our message into 
the element w = (18, 17) ∈ E. We then choose k = 3, compute 
y = ka = 3(0, 5) = (2, 4) and z = w + kb = (18, 17) + 3(3, 6) = (14, 3) 
in E using the elliptic curve addition operation, and send the elements 
y, z ∈ E to our colleague. 


3. Our colleague can then recover w by computing 

z - ny = (14, 3) - 4(2, 4) = (14, 3) - (12, 6) 
= (14, 3) + (12, 13) 
= (18, 17) 

using the elliptic curve addition operation. 
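The three steps of the elliptic curve ElGamal system can be traced in a few lines of code. The following Python sketch is an illustration of ours, not the book's Maple code: the names ec_add, ec_neg, and ec_mul are hypothetical, None stands for the point at infinity O, and ec_mul uses plain repeated addition, which is adequate only for toy parameters such as p = 19, a = (0, 5), n = 4, k = 3, and w = (18, 17).

```python
def ec_add(P, Q, c, p):
    """P + Q on y^2 = x^3 + c*x + d over Z_p, following (8.2)-(8.5)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P = -Q
    if P == Q:
        slope = (3 * x1 * x1 + c) * pow((2 * y1) % p, -1, p) % p
    else:
        slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p
    return (x3, (slope * (x1 - x3) - y1) % p)


def ec_neg(P, p):
    """Return -P = (x, -y)."""
    return None if P is None else (P[0], (-P[1]) % p)


def ec_mul(n, P, c, p):
    """Return nP by adding P to itself n times (toy sizes only)."""
    R = None
    for _ in range(n):
        R = ec_add(R, P, c, p)
    return R


c, p = 1, 19
a, n = (0, 5), 4                        # colleague's element and secret n
b = ec_mul(n, a, c, p)                  # step 1: b = na is made public
w, k = (18, 17), 3                      # our message point and secret k
y = ec_mul(k, a, c, p)                  # step 2: y = ka
z = ec_add(w, ec_mul(k, b, c, p), c, p) #         z = w + kb
recovered = ec_add(z, ec_neg(ec_mul(n, y, c, p), p), c, p)  # step 3: z - ny
print(b, y, z, recovered)
```

Subtraction is implemented as addition of the negative, exactly as in the last two lines of the computation in step 3.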




Note that to break the ElGamal cryptosystem in Example 8.4, an 
intruder would need to determine the value of n from the knowledge of a and 
b = na in E. That is, to break the ElGamal cryptosystem in Example 8.4, 
an intruder would need to solve the discrete logarithm problem (expressed 
using the additive notation na for a^n) in E. Of course, because the elliptic 
curve in this example contains so few elements, an intruder could break 
the system very easily by trial and error. However, if an elliptic curve with 
an extremely large number of elements was used in the system, and the 
element a was chosen with a very large order, then it would be extremely 
difficult (time-wise) for an intruder to break the system. 


There is a practical difficulty with using an elliptic curve E over Z_p as 
the group G in the ElGamal cryptosystem. Recall that if the system is im- 
plemented as described above, then the plaintext must be converted to one 
of the elements in E before being enciphered. This obviously limits flexi- 
bility in formatting plaintexts, and could possibly require the generation of 
many elements in E. We can avoid this difficulty by using a variation of 
the ElGamal system due to Menezes and Vanstone. Suppose as before that 
we wish to use the ElGamal cryptosystem with an elliptic curve over Z_p as 
the group G in the system to send a secret message to a colleague across 
an insecure line of communication. The steps in the Menezes-Vanstone 
variation of the ElGamal system can be stated as follows. 


1. The first step is the same as in the usual ElGamal system. That is, 
our colleague chooses a very large prime p and values for c and d that 
satisfy 4c^3 + 27d^2 ≠ 0 mod p. Then for the elliptic curve E of ordered 
pairs (x, y) ∈ Z_p × Z_p of solutions to (8.1) modulo p and point at 
infinity O, our colleague chooses an element a ∈ E with large order 
and a positive integer n, computes b = na in E, and sends p, c, d, a, 
and b to us. 


2. We convert our message into an equivalent ordered pair of numbers 
w = (w_1, w_2) ∈ Z_p^* × Z_p^* (which does not need to be an element 
in E). We then choose a positive integer k, compute y = ka and 
kb = (c_1, c_2) in E, and encipher our message as the ordered pair 
z = (z_1, z_2) ∈ Z_p^* × Z_p^* by computing 

z = (z_1, z_2) = (c_1 w_1 mod p, c_2 w_2 mod p). 

We then send the ordered pairs y and z to our colleague. 


3. Our colleague can first recover the ordered pair kb = (c_1, c_2) by 
computing ny in E since ny = nka = kna = kb in E. Our col- 
league can then recover the message w = (w_1, w_2) by computing 
(c_1^(-1) z_1 mod p, c_2^(-1) z_2 mod p) since 

(c_1^(-1) z_1 mod p, c_2^(-1) z_2 mod p) = (c_1^(-1) c_1 w_1 mod p, c_2^(-1) c_2 w_2 mod p) 
= (w_1, w_2). 

(Note: The multiplicative inverses c_1^(-1) and c_2^(-1) modulo p can be 
found in general by using the Euclidean algorithm as illustrated in 
Section 7.1.) 


Example 8.5 Suppose we wish to use the Menezes-Vanstone variation of 
the ElGamal cryptosystem with an elliptic curve over Z_p as the group G in 
the system to send a secret message to a colleague across an insecure line 
of communication. 


1. Our colleague chooses the values p = 19, c = 1, and d = 6 so that 
the elliptic curve E of ordered pairs (x, y) ∈ Z_p × Z_p of solutions to 
(8.1) modulo p and point at infinity O is the elliptic curve in Example 
8.3. Our colleague then chooses a = (0, 5) ∈ E and n = 4, computes 
b = na = (3, 6) ∈ E, and sends p, c, d, a, and b to us. 


2. We convert our message into the ordered pair w = (5, 13) ∈ Z_p × Z_p 
(which, note from Example 8.3, is not an element in E). We then 
choose k = 3, compute y = ka = (2, 4) and kb = (12, 6) in E, and 
encipher our message by computing 

z = ((12)(5) mod p, (6)(13) mod p) = (3, 2). 

We then send y and z to our colleague. 


3. Our colleague can first recover kb by computing ny = (12, 6) in E. 
Our colleague can then recover w by computing 

((12^(-1))(3) mod p, (6^(-1))(2) mod p) = ((8)(3) mod p, (16)(2) mod p) 
= (5, 13). 

(Note: 12^(-1) = 8 mod p since (12)(8) = 96 = 1 mod p, and 
6^(-1) = 16 mod p since (6)(16) = 96 = 1 mod p.) 


Note that with the elliptic curve E in Example 8.3, the usual ElGa- 
mal system allows only |E| = 18 possible plaintexts, while the Menezes- 
Vanstone variation of the system allows |Z_p^*|^2 = 324 possible plaintexts. 
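Once the shared point kb = (c_1, c_2) is known, the Menezes-Vanstone masking step involves only arithmetic in Z_p. The Python sketch below is an illustration of ours (the names mv_encipher and mv_decipher are hypothetical); it takes the point kb, or equivalently ny, as an input rather than computing it, and it assumes that both coordinates of that point are nonzero so that their inverses modulo p exist.

```python
def mv_encipher(w, kb, p):
    """Mask w = (w1, w2) with kb = (c1, c2): z = (c1*w1 mod p, c2*w2 mod p)."""
    c1, c2 = kb
    return (c1 * w[0] % p, c2 * w[1] % p)


def mv_decipher(z, ny, p):
    """Unmask z with ny = kb = (c1, c2) using inverses modulo p."""
    c1, c2 = ny
    return (pow(c1, -1, p) * z[0] % p, pow(c2, -1, p) * z[1] % p)


p = 19
kb = (12, 6)                  # shared point from Example 8.5
z = mv_encipher((5, 13), kb, p)
print(z)
print(mv_decipher(z, kb, p))
```

The plaintext pair never has to lie on the curve; only the mask kb does, which is the source of the larger plaintext space noted above.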
8.6 Elliptic Curve Cryptography with Maple 


In this section we show how Maple can be used to do the computations in an 
example of the Menezes-Vanstone variation of the ElGamal cryptosystem 
with an elliptic curve over Z_p as the group G in the system. 


Recall that to use Lemma 8.2 in constructing an elliptic curve (which 
is employed in the user-written procedure epoints we used in Section 8.4 
and will use again in this section), the prime p must satisfy p = 3 mod 4. 
We begin this section by entering the following procedure, which creates a 
Maple command for this session with the name p3mod4. This procedure 
is designed to quickly generate a large prime p with p = 3 mod 4. 

> p3mod4 := proc(s) 
>    local t; 
>    t := nextprime(s); 
>    while t mod 4 <> 3 do 
>       t := nextprime(t); 
>    od: 
>    RETURN(t); 
> end: 


The Maple procedure defined by the preceding commands takes as its input 
an integer s and returns the smallest prime p larger than s that satisfies 
p = 3 mod 4. For example, the following command defines p as the smallest 
prime larger than 220532496293778805800 that satisfies p = 3 mod 4. We 
will use this prime in our example. 

> p := p3mod4(220532496293778805800) ; 


p := 220532496293778805891 
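For readers who want to experiment outside Maple, the same search can be sketched in Python. This is an illustration of ours and not a translation supplied with the book: Python has no built-in nextprime, so the sketch includes its own Miller-Rabin test with a fixed set of bases (an assumption on our part; these bases are, to our knowledge, enough to make the test deterministic for integers far larger than the 21-digit prime used here, but for much larger inputs the test should be treated as probabilistic).

```python
SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41)


def is_prime(n):
    """Miller-Rabin primality test using SMALL_PRIMES as the bases."""
    if n < 2:
        return False
    for q in SMALL_PRIMES:
        if n % q == 0:
            return n == q
    d, r = n - 1, 0
    while d % 2 == 0:                  # write n - 1 = 2^r * d with d odd
        d //= 2
        r += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False               # a is a witness that n is composite
    return True


def p3mod4(s):
    """Smallest prime larger than s that is congruent to 3 modulo 4."""
    t = s + 1
    while not (t % 4 == 3 and is_prime(t)):
        t += 1
    return t


print(p3mod4(10), p3mod4(11))
```

Calling p3mod4(220532496293778805800) performs the same search as the Maple command above.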


For this value of p, let E be the elliptic curve of ordered pairs (x, y) ∈ Z_p × Z_p 
of solutions to (8.1) modulo p and point at infinity O with c = 1 and d = 6. 
In the following commands we define these values for c and d, and verify 
that they satisfy 4c^3 + 27d^2 ≠ 0 mod p. 

> c := 1: 

> d := 6: 

> 4*c^3 + 27*d^2 mod p; 

976 

Next, we store the right-hand side of (8.1) as eqn. 

> eqn := x^3 + c*x + d: 
For the ordered pair a in the system, we will use the first solution to (8.1) 
generated by the user-written procedure epoints that was introduced in 
Section 8.4. We define this element next. 

> read epoints; 

> a := epoints(eqn, x, 1, p); 

a := [0, 56750407271085204502] 


We could easily verify that this element a has relatively large order in E by 
repeatedly applying the user-written procedure addec that was introduced 
in Section 8.4 for adding elliptic curve elements. We will not include this 
verification here. 


Next, we define the following value for n that we will use to construct 
the ordered pair b = na in E. 


> n := 91530873521338: 


To expedite the process of adding a to itself n times using the elliptic curve 
addition operation, we have provided the user-written procedure elgamal, 
for which code is given in Appendix C.3. If this procedure is saved as the 
text file elgamal in the directory from which we are running Maple, then 
we can include the elgamal procedure in this Maple session by entering 
the following command. (Note: Because elgamal calls and uses the addec 
procedure that was introduced in Section 8.4, the addec procedure must 
be saved as the text file addec in the same directory as elgamal.) 


> read elgamal; 


We can then construct the ordered pair b = na by entering the following 
command. 


> b := elgamal(a, n, c, p); 
[88936959893700554040, 106879392491870047319] 


The parameters in this command are the ordered pair a, the multiple n of 
a we are computing, the value of c from (8.1), and the prime p. 
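For n of this size, adding a to itself n - 1 times one step at a time is not practical. One standard way to compute na with a number of additions proportional to the number of binary digits of n is the double-and-add method, sketched below in Python. This is an illustration of ours; we do not assume that the book's elgamal procedure is implemented this way (its code is in Appendix C.3). The names ec_add and ec_mul are hypothetical, and None stands for the point at infinity O.

```python
def ec_add(P, Q, c, p):
    """P + Q on y^2 = x^3 + c*x + d over Z_p, following (8.2)-(8.5)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P = -Q
    if P == Q:
        slope = (3 * x1 * x1 + c) * pow((2 * y1) % p, -1, p) % p
    else:
        slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p
    return (x3, (slope * (x1 - x3) - y1) % p)


def ec_mul(n, P, c, p):
    """Return nP by scanning the binary digits of n from the lowest bit."""
    result = None                  # running sum, starts at O
    addend = P                     # holds P, 2P, 4P, 8P, ...
    while n > 0:
        if n & 1:
            result = ec_add(result, addend, c, p)
        addend = ec_add(addend, addend, c, p)
        n >>= 1
    return result


# Small check on the curve of Example 8.3, where (0, 5) has order 18.
print(ec_mul(4, (0, 5), 1, 19), ec_mul(18, (0, 5), 1, 19))
```

With a 47-bit value such as n = 91530873521338, the loop runs 47 times instead of performing roughly 9 × 10^13 individual additions.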


Next, we define the following value for k that we will use to construct 
the ordered pairs y = ka and kb in E. 


> k := 431235145514: 


We can then construct the ordered pairs y = ka and kb in E as follows. 
> y := elgamal(a, k, c, p); 


[41921046194776811649, 52283417773968786897] 


> kb := elgamal(b, k, c, p); 


[88498850550708417382, 90428938891656008815 ] 
We now use the ordered pair kb to encipher the message, “RENDEZVOUS 
AT NOON”. We first apply the correspondence α from Chapter 
6 to convert this message into a list of two-digit integers. Using α, this 
message converts into the following list of integers: 17 04 13 03 04 25 
21 14 20 18 00 19 13 14 14 13. Next, we group these integers into two 
blocks of equal length, and place these blocks as entries in the following 
ordered pair w. 

> w := [1704130304252114, 2018001913141413]: 
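The conversion from text to the pair w can be sketched in Python as follows. This is an illustration of ours: the function name encode_blocks is hypothetical, and the letter correspondence A = 00, B = 01, ..., Z = 25 is an assumption inferred from the list of two-digit integers above. The sketch ignores spaces and assumes that the number of letters divides evenly into the requested number of blocks.

```python
def encode_blocks(message, blocks=2):
    """Turn letters into two-digit codes (A=00, ..., Z=25) and split evenly."""
    letters = [ch for ch in message.upper() if "A" <= ch <= "Z"]
    digits = "".join(f"{ord(ch) - ord('A'):02d}" for ch in letters)
    size = len(digits) // blocks
    return [int(digits[i * size:(i + 1) * size]) for i in range(blocks)]


w = encode_blocks("RENDEZVOUS AT NOON")
print(w)
```

A block that begins with the letter A would lose its leading zeros when stored as an integer, so a receiver must pad each deciphered block back to its full length before reading off the letters.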


We can then encipher the message by entering the following command. 
> z := [ kb[1]*w[1] mod p, kb[2]*w[2] mod p ]; 


z := [79041720375143250245, 25557336104884537057] 


To decipher the message, we first recover the ordered pair kb by computing 
ny in E as follows. 
> ny := elgamal(y, n, c, p); 


[88498850550708417382, 90428938891656008815] 


We can then decipher the message by entering the following command. 
> [ (ny[1]^(-1)*z[1]) mod p, (ny[2]^(-1)*z[2]) mod p ]; 


[1704130304252114, 2018001913141413] 
Written Exercises 


1. Suppose you wish to use the ElGamal cryptosystem with a group of 
the form Z_p^* for some prime p as the group G in the system to send 
a secret message to a colleague across an insecure line of communica- 
tion. Your colleague sends you the values p = 31, a = 18, and b = 9, 
and you convert your message into the numerical equivalent w = 20. 
Using the value k = 6, construct the values of y and z you would then 
send to your colleague. 


2. Suppose you wish to receive a secret message across an insecure line 
of communication from a colleague using the ElGamal cryptosystem 
with a group of the form Z_p^* for some prime p as the group G in 
the system. You send your colleague the values p = 13, a = 2, and 
b = 2^3 = 8 mod p, and your colleague converts his or her message 
into a numerical equivalent w and returns to you the values y = 5 
and z = 2. Decipher the message (recover w). 


3. Suppose you wish to use the ElGamal cryptosystem with the group 
of nonzero elements in a finite field as the group G in the system 
to send a secret message to a colleague across an insecure line of 
communication. Your colleague sends you the primitive polynomial 
p(x) = x^2 + x + 2 ∈ Z_5[x] and the polynomials a = x and b = 4x in 
G, and you convert your message into the element w = 2x + 4 ∈ G. 
Using the value k = 6, construct the polynomials y and z you would 
then send to your colleague. 


4. Suppose you wish to receive a secret message across an insecure line 
of communication from a colleague using the ElGamal cryptosystem 
with the group of nonzero elements in a finite field as the group G in 
the system. You send your colleague the primitive polynomial 
p(x) = x^2 + x + 2 ∈ Z_5[x] and the polynomials a = x and 
b = x^8 = 3x + 1 in G, and your colleague converts his or her message 
into an element w € G and returns to you the polynomials y = 2x 
and z = 4x + 4. Decipher the message (recover w). 


5. Let E be the elliptic curve of ordered pairs (x, y) ∈ Z_11 × Z_11 of 
solutions to y^2 = x^3 + x + 1 modulo 11 and point at infinity O. 

(a) Construct the elements in E. 

(b) Compute the sum (3,8) + (4,6) in E. 

(c) Compute the sum (1,6) + (1,6) in E. 

(d) Compute the sum (1,6) + (1,5) in E. 
6. Let E be the elliptic curve of ordered pairs (x, y) ∈ Z_23 × Z_23 of 
solutions to y^2 = x^3 + x + 7 modulo 23 and point at infinity O. 


(a) Use Theorem 8.3 to show that E is cyclic. (Note: |E| = 18.) 
(b) Use Theorem 8.4 to find upper and lower bounds on |E|. 


7. Let E be the elliptic curve of ordered pairs (x, y) ∈ Z_11 × Z_11 of 
solutions to y^2 = x^3 + 2x modulo 11 and point at infinity O. 


(a) Construct the elements of E. 
(b) Is E cyclic? State the structure of E given by Theorem 8.3. 


8. Suppose you wish to use the usual ElGamal cryptosystem with an 
elliptic curve over Z_p for some prime p as the group G in the system 
to send a secret message to a colleague across an insecure line of 
communication. Your colleague sends you the elements a = (8, 9) 
and b = (1, 6) in the elliptic curve E in Written Exercise 5, and you 
convert your message into the element w = (4, 6) ∈ E. Using the 
value k = 2, construct the elements y, z ∈ E you would then send to 
your colleague. (Hint: 7^(-1) = 8 mod 11.) 


9. Suppose you wish to use the Menezes-Vanstone variation of the ElGamal 
cryptosystem with an elliptic curve over $Z_p$ for some prime p as the 
group G in the system to send a secret message to a colleague across 
an insecure line of communication. Your colleague sends you the 
elements a = (8,9) and b = (1,6) in the elliptic curve E in Written 
Exercise 5, and you convert your message into the ordered pair 
w = (5,7). Using the value k = 2, construct the ordered pairs $y \in E$ 
and z you would then send to your colleague. (See the hint at the end of 
Written Exercise 8.) 


10. Suppose you wish to receive a secret message across an insecure line 
of communication from a colleague using the Menezes-Vanstone variation 
of the ElGamal cryptosystem with an elliptic curve over $Z_p$ for 
some prime p as the group G in the system. You send your colleague 
the elements a = (4,6) and b = 2a = (6,6) in the elliptic curve E 
in Written Exercise 5, and your colleague converts his or her message 
into an ordered pair w and returns to you the ordered pairs 
y = (1,6) and z = (10,10). Decipher the message (recover w). (Hint: 
$3^{-1} = 4 \bmod 11$, and $8^{-1} = 7 \bmod 11$.) 


11. Recall that for the set of ordered pairs $(x,y) \in Z_p \times Z_p$ of solutions 
to (8.1) modulo a prime p > 3 and point at infinity O to be an elliptic 
curve, the values of c and d must satisfy $4c^3 + 27d^2 \not\equiv 0 \bmod p$. 
To demonstrate the importance of this condition, use the elliptic curve 
addition operation to add the ordered pairs (0,1) and (14,0) of solutions 
to the equation $y^2 = x^3 + x + 1$ modulo 31. Explain why your 
answer shows the importance of the condition $4c^3 + 27d^2 \not\equiv 0 \bmod p$ 
in the definition of an elliptic curve over $Z_p$ for prime p > 3. 


Maple Exercises 


1. Suppose you wish to use the ElGamal cryptosystem with a group of 
the form $Z_p^*$ for some prime p as the group G in the system to send 
a secret message to a colleague across an insecure line of communica- 
tion. Your colleague sends you the values p = 10000000019, a = 132, 
and b = 240246247, and you convert your message into the numerical 
equivalent w = 2324123. Using the value k = 398824116, construct 
the values of y and z you would then send to your colleague. 


2. Suppose you wish to receive a secret message across an insecure line 
of communication from a colleague using the ElGamal cryptosystem 
with a group of the form $Z_p^*$ for some prime p as the group G in 
the system. You send your colleague the values p = 10000000019, 
a = 132, and $b = a^n = 5803048419 \bmod p$ with n = 121314333, and 
your colleague converts his or her message into a numerical equivalent 
w and returns to you the values y = 9054696956 and z = 7432712113. 
Decipher the message (recover w). 


3. Suppose you wish to use the ElGamal cryptosystem with the group 
of nonzero elements in a finite field as the group G in the system 
to send a secret message to a colleague across an insecure line of 
communication. Your colleague sends you the primitive polynomial 
$p(x) = 3x^7 + 4x + 1 \in Z_5[x]$ and the polynomials $a = x$ and 
b = 3x° + zt + 223 + 4x in G, and you convert your message into 
the element w = 22° + 42° + £z? +241 G. Using the value 
k = 1851, construct the polynomials y and z you would then send to 
your colleague. 


4. Suppose you wish to receive a secret message across an insecure line 
of communication from a colleague using the ElGamal cryptosystem 
with the group of nonzero elements in a finite field as the group G 
in the system. You send your colleague the primitive polynomial 
$p(x) = 3x^7 + 4x + 1 \in Z_5[x]$ and the polynomials $a = x$ and 
b = x” = 2g + 32° +1 in G with n = 51801, and your colleague 
converts his or her message into an element w € G and returns 
to you the polynomials y = zê + 4x + 324 + xz? + x + 2 and 
$z = 2x^6 + 2x^5 + 3x^4 + 3x^3 + 2x^2 + 3x + 2$. Decipher the message 
(recover w). 


5. Let E be the elliptic curve of ordered pairs $(x,y) \in Z_{59} \times Z_{59}$ of 
solutions to $y^2 = x^3 + 31x + 21$ modulo 59 and point at infinity O. 
(a) Construct the elements in E. 


(b) Use Theorem 8.3 to show that E is cyclic. Then explain why 
every element in E except O will be a cyclic generator for E. 


(c) Compute the sum (42,3) + (54,6) in E. 
(d) Compute the sum (42,3) + (42,3) in E. 
(e) Compute the sum (42,3) + (42,56) in E. 


6. Set up a parameterization of the Menezes-Vanstone variation of the 
ElGamal cryptosystem using an elliptic curve over Zp for some prime 
p with at least 25 digits as the group G in the system. Then use this 
parameterization of the ElGamal system to encipher and decipher 
the message, “TARGET HIT SEND NEW ORDERS”. (Use the corre- 
spondence a from Chapter 6 to convert the message into numerical 
form.) 


Chapter 9 


Polya Theory 


In this chapter we discuss some results for counting orbits when a group 
acts on a set. Because the most celebrated result we mention is the Polya 
Enumeration Theorem, we will refer to the theory we discuss in this chapter 
as Polya theory. 


We begin by stating a very simple example of the type of problem we 
consider in this chapter. Suppose we wish to construct a necklace with four 
colored beads, and that each bead can be either blue or green. If we assume 
that the beads can be rotated around the necklace, and that the necklace 
can be flipped over and worn, then how many different necklaces can we 
construct? To answer this question, suppose we stretch the necklace into 
the shape of a square with one bead at each corner. The following figures 
show the set X of 16 possible arrangements for the beads. 


[Figure: the 16 possible arrangements of blue (B) and green (G) beads at 
the corners of the square, numbered 1 through 16.] 

Of course, not all of these arrangements yield different necklaces. By rotating 
the beads around the necklace we can see that arrangements 2 through 
5 are really the same necklace. Likewise, by flipping the necklace over we 
can see that arrangements 6 and 8 are really the same necklace. The move- 
ments of rotating the beads around the necklace and flipping the necklace 
over are called rigid motions of the necklace. We can also view these rigid 
motions as motions of the single figure 


    1 -------- 4
    |          |
    |          |
    2 -------- 3


or, more specifically, motions of the set S of vertices of this figure. Note that 
each rigid motion of the necklace permutes the elements in X and S. Thus, 
we can represent these rigid motions by their permutations on X or S. We 
will use the permutations on S to answer questions like the one we posed at 
the start of this example. The advantage to using the permutations on S 
rather than X is that there are 16 elements in X, but only 4 elements in S. 
And this reduction in size would be much more important if the necklace 
contained more beads or if more colors were available for each bead. For 
example, if the necklace contained four beads but five colors were available 
for each bead, then there would be $5^4 = 625$ possible arrangements of the 
beads, but still only four vertices of the preceding general figure. 


9.1 Group Actions 


Recall that the set of rigid motions of a square forms a group with the 
operation of composition. In the following table we list the elements in 
this group G along with their permutations on S expressed as cycles. The 
rotations are counterclockwise. (Note that we include all cycles of length 
one. The significance of this will be apparent in Section 9.3.) 


Element in G                                    Permutation on S 
$\pi_1$ = 90° rotation                          (1234) 
$\pi_2$ = 180° rotation                         (13)(24) 
$\pi_3$ = 270° rotation                         (1432) 
$\pi_4$ = reflection across horizontal          (12)(34) 
$\pi_5$ = reflection across vertical            (14)(23) 
$\pi_6$ = reflection across 1-3 diagonal        (24)(1)(3) 
$\pi_7$ = reflection across 2-4 diagonal        (13)(2)(4) 
$\pi_8$ = identity                              (1)(2)(3)(4) 


Expressing the elements in G as permutations on X would in general require 
much longer notation. For example, the 90° rotation would be represented 
as a permutation on X by (1)(2, 5, 4, 3)(6, 9, 8, 7)(10, 11)(12, 15, 14, 13)(16). 
Since each rigid motion $\pi \in G$ corresponds to unique permutations 
on S and X, in this chapter we will often write $\pi$ when we 
mean to refer to one of the permutations. For example, by writing $\pi_1$, we 
could mean the 90° rotation, the permutation (1234) on S, or the permutation 
(1)(2, 5, 4, 3)(6, 9, 8, 7)(10, 11)(12, 15, 14, 13)(16) on X. The context 
will make it clear which one we intend. 


We now formalize our necklace example. Let S be a collection of 
objects, and let R be a set of elements called colors (not necessarily colors 
in the usual sense). A coloring of S by R is an assignment of exactly one color 
to each element in S. That is, a coloring of S by R is a function $f : S \to R$. 
Note that if |S| = n and |R| = m, then there will be $m^n$ distinct colorings 
of S by R. We will denote by X the set of colorings of S by R. The set X 
of 16 possible arrangements of the beads in our necklace example is the set 
of 16 colorings of S = {vertices of a square} by R = {blue, green}. 


Now, consider a group G and a set Y. An action of G on Y is a 
mapping $Y \times G \to Y$ such that 


1. $(y)(gh) = ((y)g)h$ for all $y \in Y$ and $g, h \in G$, 


2. $(y)1 = y$ for all $y \in Y$, where 1 represents the identity in G. 


In our necklace example, the group G = {rigid motions of a square} acts 
on both S = {vertices of a square} and X = {colorings of S by R} where 
R = {blue, green}. As illustrated in this example, when a group G acts on 
a set Y, each element in G can be represented as a permutation on Y. 
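
To make the correspondence between rigid motions and permutations concrete, 
the following is a minimal Python sketch (Python is used here only for 
illustration; it is not the software used in this book, and the helper names 
`from_cycles` and `compose` are hypothetical). It transcribes the table at the 
start of this section, composes permutations using the right-action convention 
$(y)(gh) = ((y)g)h$, and checks that the eight permutations are closed under 
composition and that the identity fixes every vertex. 

```python
def from_cycles(cycles, n=4):
    """Turn disjoint cycles on {1, ..., n} into a tuple p with p[v-1] = (v)p."""
    image = list(range(1, n + 1))
    for cycle in cycles:
        for i, v in enumerate(cycle):
            image[v - 1] = cycle[(i + 1) % len(cycle)]
    return tuple(image)

# Rigid motions of the square, transcribed from the table in Section 9.1.
G = {
    "pi1": from_cycles([(1, 2, 3, 4)]),    # 90 degree rotation
    "pi2": from_cycles([(1, 3), (2, 4)]),  # 180 degree rotation
    "pi3": from_cycles([(1, 4, 3, 2)]),    # 270 degree rotation
    "pi4": from_cycles([(1, 2), (3, 4)]),  # reflection across horizontal
    "pi5": from_cycles([(1, 4), (2, 3)]),  # reflection across vertical
    "pi6": from_cycles([(2, 4)]),          # reflection across 1-3 diagonal
    "pi7": from_cycles([(1, 3)]),          # reflection across 2-4 diagonal
    "pi8": from_cycles([]),                # identity
}

def compose(g, h):
    """Right action: (y)(gh) = ((y)g)h, so g is applied first, then h."""
    return tuple(h[g[y - 1] - 1] for y in range(1, len(g) + 1))

elements = set(G.values())
closed = all(compose(g, h) in elements for g in elements for h in elements)
identity_fixes_all = all(G["pi8"][y - 1] == y for y in range(1, 5))
print(len(elements), closed, identity_fixes_all)  # prints: 8 True True
```

Storing each permutation as a tuple of images keeps composition to a single 
line and makes the elements hashable, so they can be collected in a set. 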


Lemma 9.1 Suppose a group G acts on a set Y. For any $x, y \in Y$, define 
$x \sim y$ if there exists $g \in G$ for which $(x)g = y$. Then $\sim$ is an 
equivalence relation. 


Proof. Exercise. ∎ 


As a consequence of Lemma 9.1, when a group G acts on a set Y, the 
set is decomposed into equivalence classes of elements that can be mapped 
to each other by elements in G. These equivalence classes are called orbits. 
When Y is a set of colorings, these orbits are also called patterns. The gen- 
eral type of problem we consider in this chapter can be viewed as counting 
the number of patterns when a group acts on a set of colorings. 


In summary, suppose S is a set, R is a set of colors, and X is the set 
of colorings of S by R. When a group G acts on S, G also acts on X by 
$((a)f)\pi = ((a)\pi)f$ for all $a \in S$, $f \in X$, and $\pi \in G$. Two colorings 
$f, g \in X$ are equivalent if there exists $\pi \in G$ such that 
$((a)f)\pi = (a)g$ for all $a \in S$. 
Hence, two of the 16 colorings in our necklace example are equivalent if there 
is a rigid motion of a square that maps one to the other. To answer the 
question that began our necklace example, we need only count the number 
of patterns under this equivalence. From the list of colorings shown at the 
beginning of this chapter we can easily see that there are six such patterns: 
{1}, {2,3,4,5}, {6,7,8,9}, {10,11}, {12,13,14,15}, and {16}. With only 
16 possible arrangements for the beads, it is not necessary to consider the 
group action of G on S or X to count the patterns. However, it would 
certainly not be practical to list all of the possible arrangements for the 
beads if the necklace had 10 beads and 12 colors were available for each 
bead. In this chapter we discuss how the idea of a group action can be used 
to count patterns without actually constructing the patterns. 
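
For a set of colorings this small, the patterns can also be found by brute 
force. The following Python sketch (an illustration only; the function names 
`act` and `orbits` are our own, and colorings are stored as tuples rather than 
by the numbering 1 through 16 used above) lists all 16 colorings and groups 
them into orbits under the eight rigid motions. 

```python
from itertools import product

# Images of vertices 1..4 under pi_1, ..., pi_8 (table in Section 9.1).
G = [(2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (2, 1, 4, 3),
     (4, 3, 2, 1), (1, 4, 3, 2), (3, 2, 1, 4), (1, 2, 3, 4)]

def act(f, p):
    """Move coloring f by p: vertex a receives the color f gives to (a)p."""
    return tuple(f[p[a] - 1] for a in range(len(f)))

def orbits(colorings, group):
    """Partition the colorings into orbits (patterns) under the group."""
    remaining, result = set(colorings), []
    while remaining:
        f = next(iter(remaining))
        orbit = {act(f, p) for p in group}
        result.append(orbit)
        remaining -= orbit
    return result

X = list(product("BG", repeat=4))   # all colorings of 4 vertices by 2 colors
patterns = orbits(X, G)
print(len(X), len(patterns), sorted(len(o) for o in patterns))
# prints: 16 6 [1, 1, 2, 4, 4, 4]
```

The orbit sizes agree with the six patterns listed above. The same code would 
become impractical for 10 beads and 12 colors, which is the motivation for the 
counting results that follow. 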


9.2 Burnside’s Theorem 


Our goal in this chapter is to count the number of patterns when a group 
acts on a set of colorings. Counting the number of orbits when a group 
acts on a set is the focus of a fundamental result from Burnside. Before 
establishing this result, we first define some additional terms. 


Suppose a group G acts on a set Y. Then for each element $\pi \in G$, 
we denote by $\mathrm{Fix}(\pi)$ the set of elements in Y that are fixed by $\pi$. That is, 


$$\mathrm{Fix}(\pi) = \{y \in Y \mid (y)\pi = y\}.$$ 


Example 9.1 Consider G acting on X in our necklace example. Using the 
notation $\pi_i$ defined at the start of Section 9.1 for the elements in G, and 
the enumeration from the beginning of this chapter for the colorings in X, 
we list $\mathrm{Fix}(\pi_i)$ for each $\pi_i \in G$ in the following table. 


$\pi_i \in G$    $\mathrm{Fix}(\pi_i)$              $|\mathrm{Fix}(\pi_i)|$ 
$\pi_1$          1, 16                              2 
$\pi_2$          1, 10, 11, 16                      4 
$\pi_3$          1, 16                              2 
$\pi_4$          1, 7, 9, 16                        4 
$\pi_5$          1, 6, 8, 16                        4 
$\pi_6$          1, 2, 4, 10, 11, 12, 14, 16        8 
$\pi_7$          1, 3, 5, 10, 11, 13, 15, 16        8 
$\pi_8$          X                                  16 


Suppose again that a group G acts on a set Y. Then for each element 
$y \in Y$, we denote by $\mathrm{Stab}(y)$ the subgroup of elements in G that fix y. 
That is, $\mathrm{Stab}(y) = \{\pi \in G \mid (y)\pi = y\}$. 


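Example 9.2 Consider again G acting on X in our necklace example. We 
list $\mathrm{Stab}(x)$ for each $x \in X$ in the following table. 


$x \in X$    $\mathrm{Stab}(x)$                     $|\mathrm{Stab}(x)|$ 
1            G                                      8 
2            $\pi_6, \pi_8$                         2 
3            $\pi_7, \pi_8$                         2 
4            $\pi_6, \pi_8$                         2 
5            $\pi_7, \pi_8$                         2 
6            $\pi_5, \pi_8$                         2 
7            $\pi_4, \pi_8$                         2 
8            $\pi_5, \pi_8$                         2 
9            $\pi_4, \pi_8$                         2 
10           $\pi_2, \pi_6, \pi_7, \pi_8$           4 
11           $\pi_2, \pi_6, \pi_7, \pi_8$           4 
12           $\pi_6, \pi_8$                         2 
13           $\pi_7, \pi_8$                         2 
14           $\pi_6, \pi_8$                         2 
15           $\pi_7, \pi_8$                         2 
16           G                                      8 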


Note that the sum of the entries in the $|\mathrm{Fix}(\pi_i)|$ column in Example 
9.1 and the sum of the entries in the $|\mathrm{Stab}(x)|$ column in Example 9.2 are both 
48. This equality is guaranteed in general by the following lemma. 


Lemma 9.2 If a group G acts on Y, then 
$\sum_{\pi \in G} |\mathrm{Fix}(\pi)| = \sum_{y \in Y} |\mathrm{Stab}(y)|$. 


Proof. Exercise. (Let $S = \{(y, \pi) \mid (y)\pi = y,\ y \in Y,\ \pi \in G\}$, and 
count |S| in two ways: first by ranging through the possibilities for y, and 
then by ranging through the possibilities for $\pi$.) ∎ 
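
The double count suggested in the proof of Lemma 9.2 can be checked directly 
for the necklace example. In the following Python sketch (illustrative only; 
`act`, `fix_sizes`, and `stab_sizes` are names of our own choosing), the same 
set of fixed pairs is counted once by group element and once by coloring. 

```python
from itertools import product

# Images of vertices 1..4 under pi_1, ..., pi_8 (table in Section 9.1).
G = [(2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (2, 1, 4, 3),
     (4, 3, 2, 1), (1, 4, 3, 2), (3, 2, 1, 4), (1, 2, 3, 4)]

def act(f, p):
    """Move coloring f by p: vertex a receives the color f gives to (a)p."""
    return tuple(f[p[a] - 1] for a in range(len(f)))

X = list(product("BG", repeat=4))

# Count the pairs (f, p) with (f)p = f, grouped first by p, then by f.
fix_sizes = [sum(1 for f in X if act(f, p) == f) for p in G]
stab_sizes = [sum(1 for p in G if act(f, p) == f) for f in X]

print(fix_sizes)                         # prints: [2, 4, 2, 4, 4, 8, 8, 16]
print(sum(fix_sizes), sum(stab_sizes))   # prints: 48 48
```

The first list reproduces the last column of the table in Example 9.1, and 
both sums equal 48, as noted before the lemma. 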


Suppose again that G acts on Y. Then for $y \in Y$ we denote the orbit 
of y by $\mathrm{Orb}(y)$. That is, $\mathrm{Orb}(y) = \{x \in Y \mid x = (y)\pi \text{ for some } \pi \in G\}$. 


Lemma 9.3 If a group G acts on Y, then $|G| = |\mathrm{Stab}(y)| \cdot |\mathrm{Orb}(y)|$ for 
each $y \in Y$. 


Proof. Suppose g and h are in the same right coset of $\mathrm{Stab}(y)$. Hence, 
$g = \pi h$ for some $\pi \in \mathrm{Stab}(y)$. Thus, $(y)g = (y)\pi h = ((y)\pi)h = (y)h$. On 
the other hand, suppose $(y)g = (y)h$ for some $g, h \in G$. Then $y = (y)hg^{-1}$, 
and $hg^{-1} \in \mathrm{Stab}(y)$. Therefore, $hg^{-1} = \pi$ for some $\pi \in \mathrm{Stab}(y)$. Hence, 
$h = \pi g$, and h and g are in the same right coset of $\mathrm{Stab}(y)$. In summary, g 
and h are in the same right coset of $\mathrm{Stab}(y)$ if and only if $(y)g = (y)h$. Thus, 
there is a bijection between the right cosets of $\mathrm{Stab}(y)$ and the elements in 
$\mathrm{Orb}(y)$. Using this and Lagrange’s Theorem (Theorem 1.4), we conclude 


$$|G| = |\mathrm{Stab}(y)| \cdot (\text{number of right cosets of } \mathrm{Stab}(y)) = |\mathrm{Stab}(y)| \cdot |\mathrm{Orb}(y)|.$$ 
∎ 
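
The bijection in this proof can be seen in a small case. The following Python 
sketch (illustrative; the dictionary `classes` and the choice y = 1 are our own) 
takes G acting on the vertex set S of the square and groups the elements of G 
by the image of vertex 1. Each group is a right coset of Stab(1), and there is 
one coset for each element of Orb(1). 

```python
from collections import defaultdict

# Images of vertices 1..4 under pi_1, ..., pi_8 (table in Section 9.1).
G = {"pi1": (2, 3, 4, 1), "pi2": (3, 4, 1, 2), "pi3": (4, 1, 2, 3),
     "pi4": (2, 1, 4, 3), "pi5": (4, 3, 2, 1), "pi6": (1, 4, 3, 2),
     "pi7": (3, 2, 1, 4), "pi8": (1, 2, 3, 4)}

y = 1
classes = defaultdict(list)
for name, g in G.items():
    classes[g[y - 1]].append(name)   # group elements of G by the image (y)g

stabilizer = classes[y]              # elements that send y to itself
orbit = sorted(classes)              # every vertex reachable from y

print(stabilizer, orbit)             # prints: ['pi6', 'pi8'] [1, 2, 3, 4]
print(len(stabilizer) * len(orbit) == len(G))   # prints: True
```

Here every class has exactly two elements, so |G| = 2 · 4 = 8, in agreement 
with Lemma 9.3. 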


We now establish the following fundamental result for counting orbits 
when a group acts on a set. This result is due to Burnside, and thus we 
will call it Burnside’s Theorem. 


Theorem 9.4 Suppose a group G acts on a set Y. Then the number of 
orbits in Y is $\frac{1}{|G|} \sum_{\pi \in G} |\mathrm{Fix}(\pi)|$. 

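Proof. Dividing both sides of the equation in Lemma 9.2 by |G| yields 


$$\frac{1}{|G|} \sum_{\pi \in G} |\mathrm{Fix}(\pi)| = \frac{1}{|G|} \sum_{y \in Y} |\mathrm{Stab}(y)|.$$ 

And by Lemma 9.3, we know that 

$$\frac{1}{|G|} \sum_{y \in Y} |\mathrm{Stab}(y)| = \sum_{y \in Y} \frac{1}{|\mathrm{Orb}(y)|}.$$ 


Suppose there are s orbits in Y, which we denote by $O_1, O_2, \ldots, O_s$. Then 
if $x, y \in O_j$, it follows that $|\mathrm{Orb}(x)| = |\mathrm{Orb}(y)| = |O_j|$. But then 


$$\sum_{y \in Y} \frac{1}{|\mathrm{Orb}(y)|} = \sum_{j=1}^{s} \sum_{y \in O_j} \frac{1}{|O_j|} = \sum_{j=1}^{s} 1 = s.$$ 
∎ 
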
To see how Theorem 9.4 can be applied, consider G acting on X in our 
necklace example. From Example 9.1, we can see that 
$\sum_{\pi \in G} |\mathrm{Fix}(\pi)| = 48$. 
Then, since |G| = 8, it follows from Burnside’s Theorem that the number of 
orbits in X is 48/8 = 6. That is, as we have already seen, there are 6 distinct 
necklaces in our necklace example. While this result is certainly correct, in 
practice the result of Burnside’s Theorem is not usually determined exactly 
in this manner. Specifically, in practice the value of 
$\sum_{\pi \in G} |\mathrm{Fix}(\pi)|$ is not 
usually determined by actually constructing the sets $\mathrm{Fix}(\pi)$ as we did in 
Example 9.1. To construct the table in Example 9.1 we referenced the 
list of all possible necklace arrangements shown at the beginning of this 
chapter. However, recall that in general we would like to be able to count 
orbits without having to list all of the possible arrangements. We discuss 
a method for doing this next. 


9.3 The Cycle Index 


In our necklace example, consider the rigid motion $\pi_7$ = reflection across 
2-4 diagonal. Note that if $x \in \mathrm{Fix}(\pi_7)$, then x must have the same color 
bead at vertices 1 and 3, but can have any color bead at vertices 2 and 
4. Hence, since two colors are available for the beads, there will be 
2 · 2 · 2 = 8 colorings fixed by $\pi_7$. Thus, we can determine that $|\mathrm{Fix}(\pi_7)| = 8$ 
without having to reference the list of all possible necklace arrangements 
shown at the beginning of this chapter. And we could determine $|\mathrm{Fix}(\pi_7)|$ 
in the same way if more than two colors were available for the beads. For 
example, if five colors were available for the beads, then there would be 
5 · 5 · 5 = 125 colorings fixed by $\pi_7$. Note that $|\mathrm{Fix}(\pi_7)|$ depends only on the 
number of colors available for the beads and the number of sets of vertices 
that can take arbitrary colors. Specifically, if a colors are available for the 
beads, then $|\mathrm{Fix}(\pi_7)| = a^3$. 


The preceding discussion can be generalized as follows. Suppose $\pi$ is 
a rigid motion for which k sets of vertices can take arbitrary colors. Then 
if a colors are available, it follows that $|\mathrm{Fix}(\pi)| = a^k$. It is easy to see 
the number of sets of vertices that can take arbitrary colors from the cycle 
representation of $\pi$ as a permutation on the set S of vertices. For example, 
recall that in our necklace example the rigid motion $\pi_7$ can be represented 
as the permutation (13)(2)(4) on S. Since there are three disjoint cycles in 
this representation for $\pi_7$, there will be three factors of a in $|\mathrm{Fix}(\pi_7)|$. 
In general, if there are k disjoint cycles in the representation of $\pi$ as a 
permutation on S and a colors are available, then $|\mathrm{Fix}(\pi)| = a^k$. This 
relates to the material in Section 9.2 because it provides us with a way to use 
Burnside’s Theorem for counting patterns without having to refer to a list 
of all of the possible arrangements. Specifically, it states that the sum in the 
formula in Burnside’s Theorem can be expressed as 
$\sum_{\pi \in G} |\mathrm{Fix}(\pi)| = \sum_{\pi \in G} a^{k_\pi}$, 
where $k_\pi$ is the number of disjoint cycles in the representation of $\pi$ as a 
permutation on S. 


Example 9.3 Consider G acting on S in our necklace example. From the 
table at the start of Section 9.1 we can see that the number of disjoint 
cycles in the representations of $\pi_i$ as permutations on S are 1, 2, 1, 2, 2, 3, 
3, and 4, respectively. Thus, with two colors available for the beads, 


$$\sum_{\pi \in G} |\mathrm{Fix}(\pi)| = 2^1 + 2^2 + 2^1 + 2^2 + 2^2 + 2^3 + 2^3 + 2^4 = 48.$$ 


And if five colors were available for the beads instead of just two, then 


$$\sum_{\pi \in G} |\mathrm{Fix}(\pi)| = 5^1 + 5^2 + 5^1 + 5^2 + 5^2 + 5^3 + 5^3 + 5^4 = 960.$$ 


Hence, by Burnside’s Theorem there are 
$\frac{1}{|G|} \sum_{\pi \in G} |\mathrm{Fix}(\pi)| = \frac{1}{8} \cdot 960 = 120$ 
distinct necklaces if five colors are available for the beads. ∎ 
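
The computation in Example 9.3 needs only the number of disjoint cycles of 
each permutation on S. The following Python sketch (illustrative; the function 
names `cycle_count` and `count_patterns` are our own) counts cycles directly 
from the permutations and then applies Burnside’s Theorem without listing any 
colorings. 

```python
# Images of vertices 1..4 under pi_1, ..., pi_8 (table in Section 9.1).
G = [(2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (2, 1, 4, 3),
     (4, 3, 2, 1), (1, 4, 3, 2), (3, 2, 1, 4), (1, 2, 3, 4)]

def cycle_count(p):
    """Number of disjoint cycles of p, counting cycles of length one."""
    seen, count = set(), 0
    for start in range(1, len(p) + 1):
        if start not in seen:
            count += 1
            v = start
            while v not in seen:     # walk around the cycle containing start
                seen.add(v)
                v = p[v - 1]
    return count

def count_patterns(group, a):
    """Burnside count with |Fix(pi)| = a ** (number of cycles of pi)."""
    total = sum(a ** cycle_count(p) for p in group)
    return total // len(group)

print([cycle_count(p) for p in G])               # prints: [1, 2, 1, 2, 2, 3, 3, 4]
print(count_patterns(G, 2), count_patterns(G, 5))  # prints: 6 120
```

Only the number of colors a changes between the two counts; the cycle 
structure of the group is computed once from the permutations on S. 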


This process for computing $\sum_{\pi \in G} |\mathrm{Fix}(\pi)|$ can be refined as follows. 
Suppose $\pi \in G$ is a rigid motion that, when acting on S, is represented by the 
product of disjoint cycles of lengths $i_1, i_2, \ldots, i_t$. We then associate with $\pi$ 
the monomial $f_\pi = x_{i_1} x_{i_2} \cdots x_{i_t}$. For example, with $\pi_7 = (13)(2)(4)$ in our 
necklace example, we associate the monomial $f_{\pi_7} = x_2 x_1 x_1 = (x_1)^2 x_2$. We 
then define the cycle index of G acting on S as 


$$f(x_1, x_2, \ldots, x_w) = \frac{1}{|G|} \sum_{\pi \in G} f_\pi,$$ 


where w is the length of the longest cycle in the representation of any $\pi \in G$ 
as a permutation on S. The cycle index is of interest to us because of the 
following theorem, which states how it can be used to count orbits when a 
group acts on a set. 


Theorem 9.5 Let S be a set, let R be a set of colors, and let X be the 
set of colorings of S by R. Suppose a group G acts on S with cycle index 
$f(x_1, x_2, \ldots, x_w)$. If |R| = a, then the number of patterns in X under the 
corresponding action of G on X is $f(a, a, \ldots, a)$. 


Proof. Exercise. ∎ 


Example 9.4 For our necklace example we list the monomial $f_\pi$ for each 
$\pi \in G$ in the following table. 


$\pi \in G$         $f_\pi$ 
(1234)              $x_4$ 
(13)(24)            $(x_2)^2$ 
(1432)              $x_4$ 
(12)(34)            $(x_2)^2$ 
(14)(23)            $(x_2)^2$ 
(24)(1)(3)          $(x_1)^2 x_2$ 
(13)(2)(4)          $(x_1)^2 x_2$ 
(1)(2)(3)(4)        $(x_1)^4$ 


Thus, the cycle index for this example is 


$$f(x_1, x_2, x_3, x_4) = \frac{1}{8}\left(2x_4 + 3x_2^2 + 2x_1^2 x_2 + x_1^4\right).$$ 


Hence, if two colors are available for the beads, then there will be 
$f(2,2,2,2) = \frac{1}{8}(4 + 12 + 16 + 16) = 6$ distinct necklaces. And if five colors 
were available for the beads instead of just two, then there would be 
$f(5,5,5,5) = \frac{1}{8}(10 + 75 + 250 + 625) = 120$ distinct necklaces. ∎ 
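
The table and cycle index in Example 9.4 can be assembled mechanically. In the 
following Python sketch (illustrative; the representation of a monomial by its 
sorted tuple of cycle lengths, and the names `cycle_type`, `cycle_index`, and 
`evaluate`, are our own assumptions), the monomial $(x_1)^2 x_2$ is stored as 
the tuple (1, 1, 2), and the cycle index is a dictionary from monomials to 
rational coefficients. 

```python
from collections import Counter
from fractions import Fraction

# Images of vertices 1..4 under pi_1, ..., pi_8 (table in Section 9.1).
G = [(2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (2, 1, 4, 3),
     (4, 3, 2, 1), (1, 4, 3, 2), (3, 2, 1, 4), (1, 2, 3, 4)]

def cycle_type(p):
    """Sorted tuple of cycle lengths of p, e.g. (1, 1, 2) for (13)(2)(4)."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            v = p[v - 1]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

def cycle_index(group):
    """Map each monomial (as a cycle type) to its coefficient in the index."""
    counts = Counter(cycle_type(p) for p in group)
    return {t: Fraction(c, len(group)) for t, c in counts.items()}

def evaluate(index, values):
    """Evaluate the cycle index with x_i replaced by values[i]."""
    total = Fraction(0)
    for t, coeff in index.items():
        term = coeff
        for length in t:
            term *= values[length]
        total += term
    return total

f = cycle_index(G)
print(evaluate(f, {i: 2 for i in range(1, 5)}))   # prints: 6
print(evaluate(f, {i: 5 for i in range(1, 5)}))   # prints: 120
```

Exact fractions are used so that the division by |G| introduces no rounding. 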


Example 9.5 Suppose we wish to construct a necklace with six colored 
beads. As in the 4-bead necklace example, we assume that the beads can 
be rotated around the necklace, and that the necklace can be flipped over 
and worn. In this example, we use a cycle index to determine the number 
of distinct necklaces we can construct with a specified number of colors 
available for each bead. To do this, suppose we stretch the necklace into 
the shape of a hexagon with one bead at each corner. Consider the following 
general shape for the necklace. 

[Figure: a hexagon with vertices labeled 1 through 6 in order around its 
perimeter.] 


Let G be the set of rigid motions of a hexagon, and let S be the set of 
vertices of the preceding general figure. In the following table, we list the 
elements $\pi \in G$, their cycle representations as permutations on S, and the 
associated monomials $f_\pi$. 


$\pi \in G$                   Permutation on S        $f_\pi$ 
60° rotation                  (123456)                $x_6$ 
120° rotation                 (135)(246)              $(x_3)^2$ 
180° rotation                 (14)(25)(36)            $(x_2)^3$ 
240° rotation                 (153)(264)              $(x_3)^2$ 
300° rotation                 (165432)                $x_6$ 
reflection fixing 2,5         (13)(46)(2)(5)          $(x_1)^2 (x_2)^2$ 
reflection fixing 1,4         (26)(35)(1)(4)          $(x_1)^2 (x_2)^2$ 
reflection fixing 3,6         (15)(24)(3)(6)          $(x_1)^2 (x_2)^2$ 
reflection across vertical    (16)(25)(34)            $(x_2)^3$ 
reflection across diagonal    (23)(14)(56)            $(x_2)^3$ 
reflection across diagonal    (12)(36)(45)            $(x_2)^3$ 
identity                      (1)(2)(3)(4)(5)(6)      $(x_1)^6$ 


Thus, the cycle index for this example is 


$$f(x_1, x_2, x_3, x_4, x_5, x_6) = \frac{1}{12}\left(2x_6 + 2x_3^2 + 4x_2^3 + 3x_1^2 x_2^2 + x_1^6\right).$$ 


Hence, if two colors are available for the beads, then there will be 
$f(2,2,2,2,2,2) = \frac{1}{12}(4 + 8 + 32 + 48 + 64) = 13$ distinct necklaces. And if 
five colors are available for the beads instead of just two, there will be 
$f(5,5,5,5,5,5) = \frac{1}{12}(10 + 50 + 500 + 1875 + 15625) = 1505$ distinct 
necklaces. ∎ 
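
The same count can be organized for a necklace with any number of beads. The 
following Python sketch is a minimal illustration under assumptions of our own: 
the vertices are relabeled 0 through n - 1, the rotations are taken as 
i -> i + k (mod n), and the reflections as i -> c - i (mod n). The function 
names `dihedral_group` and `count_necklaces` are hypothetical. 

```python
def dihedral_group(n):
    """The 2n rigid motions of a regular n-gon (n >= 3) on vertices 0..n-1."""
    rotations = [tuple((i + k) % n for i in range(n)) for k in range(n)]
    reflections = [tuple((c - i) % n for i in range(n)) for c in range(n)]
    return rotations + reflections

def cycle_count(p):
    """Number of disjoint cycles of p, counting cycles of length one."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = p[v]
    return count

def count_necklaces(n, a):
    """Evaluate the cycle index of the n-bead necklace at x_i = a."""
    group = dihedral_group(n)
    return sum(a ** cycle_count(p) for p in group) // len(group)

print(count_necklaces(6, 2), count_necklaces(6, 5))   # prints: 13 1505
print(count_necklaces(4, 2), count_necklaces(4, 5))   # prints: 6 120
```

The values for six beads agree with Example 9.5, and the values for four beads 
agree with Example 9.4. 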


9.4 The Pattern Inventory 


From Example 9.5, we can see that 1505 distinct necklaces can be con- 
structed with six beads if five colors are available for each bead. Consider 
now the following question. How many of these 1505 distinct necklaces have 
beads with only three of the five possible colors? Or, more specifically, if 
the colors available for the beads are blue, green, red, white, and yellow, 
then how many of these 1505 distinct necklaces have exactly two red beads, 
three white beads, one yellow bead, and no blue or green beads? In this 
section, we discuss a way to answer such questions. 


Let S be a set, let R be the set $\{C_1, C_2, \ldots, C_t\}$ of colors, and let X 
be the set of colorings of S by R. Suppose a group G acts on S with cycle 
index $f(x_1, x_2, \ldots, x_w)$. Then the simplified symbolic expression 


$$f(C_1 + \cdots + C_t,\ C_1^2 + \cdots + C_t^2,\ \ldots,\ C_1^w + \cdots + C_t^w)$$ 

is called the pattern inventory of X. The pattern inventory of X allows 
us to answer questions like those posed at the start of this section. This 
is due to the following theorem, commonly called the Polya Enumeration 
Theorem. 


Theorem 9.6 Suppose the monomial $k C_1^{i_1} C_2^{i_2} \cdots C_t^{i_t}$ appears in the 
pattern inventory of X. Then there are k patterns in X in which $C_1$ appears 
$i_1$ times, $C_2$ appears $i_2$ times, ..., and $C_t$ appears $i_t$ times. 


Because the proof of the Polya Enumeration Theorem is extensive, 
before verifying this theorem we first show how it can be applied in our 
4-bead necklace example. 


Example 9.6 Consider our 4-bead necklace example with cycle index 


$$f(x_1, x_2, x_3, x_4) = \frac{1}{8}\left(2x_4 + 3x_2^2 + 2x_1^2 x_2 + x_1^4\right).$$ 

Suppose that each bead can be either B = blue or G = green. Then the 
pattern inventory of the set X of colorings is 


$$f(B + G,\ B^2 + G^2,\ B^3 + G^3,\ B^4 + G^4)$$ 
$$= \frac{1}{8}\left(2(B^4 + G^4) + 3(B^2 + G^2)^2 + 2(B + G)^2(B^2 + G^2) + (B + G)^4\right)$$ 
$$= B^4 + B^3G + 2B^2G^2 + BG^3 + G^4.$$ 


From this pattern inventory, we can easily see the number of distinct 4-bead 
necklaces that have prescribed numbers of blue and green beads. For 
example, because the term $BG^3$ appears in this pattern inventory with a 
coefficient of 1 and exponents of 1 on B and 3 on G, there is only one 
distinct 4-bead necklace with one blue bead and three green beads. And 
because the term $2B^2G^2$ appears in this pattern inventory with a coefficient 
of 2 and exponents of 2 on B and G, there are two distinct 4-bead 
necklaces with two blue beads and two green beads. 


Now, suppose that each bead can also be R = red. Then the pattern 
inventory of the set of colorings is 


$$f(B + G + R,\ B^2 + G^2 + R^2,\ B^3 + G^3 + R^3,\ B^4 + G^4 + R^4)$$ 
$$= \frac{1}{8}\left(2(B^4 + G^4 + R^4) + 3(B^2 + G^2 + R^2)^2 + \cdots + (B + G + R)^4\right)$$ 
$$= B^4 + B^3G + 2B^2G^2 + BG^3 + G^4 + B^3R + 2B^2R^2 + BR^3 + R^4$$ 
$$\quad + G^3R + 2G^2R^2 + GR^3 + 2BGR^2 + 2BG^2R + 2B^2GR.$$ 


For example, because the term $2BGR^2$ appears in this pattern inventory, 
there are two distinct 4-bead necklaces with one blue bead, one green 
bead, and two red beads. Note that by adding the coefficients of all of the 
terms in this pattern inventory, we see that there are 21 distinct 4-bead 
necklaces if three colors are available for the beads. Also, note that by 
adding the coefficients of just the last ten terms in this pattern inventory, 
we see that 15 of these 21 distinct necklaces have at least one red bead. 
Finally, as we would expect, note that each term in the 2-color pattern 
inventory is present in the 3-color pattern inventory. ∎ 
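
The substitution carried out in Example 9.6 can be reproduced with a small 
amount of polynomial arithmetic. The following Python sketch is illustrative 
only and rests on a representation of our own: a polynomial in the colors is a 
dictionary from exponent tuples to coefficients, so that $2B^2G^2$ with colors 
(B, G) is the entry (2, 2): 2. The names `multiply` and `pattern_inventory` 
are hypothetical. 

```python
from collections import Counter

# Images of vertices 1..4 under pi_1, ..., pi_8 (table in Section 9.1).
G = [(2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3), (2, 1, 4, 3),
     (4, 3, 2, 1), (1, 4, 3, 2), (3, 2, 1, 4), (1, 2, 3, 4)]

def cycle_lengths(p):
    """Lengths of the disjoint cycles of p, including cycles of length one."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            v = p[v - 1]
            length += 1
        lengths.append(length)
    return lengths

def multiply(a, b):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(i + j for i, j in zip(ea, eb))] += ca * cb
    return out

def pattern_inventory(group, num_colors):
    """Cycle index with x_k replaced by the sum of the k-th powers of the colors."""
    total = Counter()
    for p in group:
        term = Counter({(0,) * num_colors: 1})          # the constant 1
        for length in cycle_lengths(p):
            power_sum = Counter({
                tuple(length if i == j else 0 for i in range(num_colors)): 1
                for j in range(num_colors)})
            term = multiply(term, power_sum)
        total.update(term)                               # adds coefficients
    return {e: c // len(group) for e, c in total.items()}

two = pattern_inventory(G, 2)     # exponents ordered as (B, G)
three = pattern_inventory(G, 3)   # exponents ordered as (B, G, R)
print(sorted(two.items(), reverse=True))
print(sum(three.values()), sum(c for e, c in three.items() if e[2] > 0))
# second line prints: 21 15
```

The first line of output lists the five terms of the 2-color inventory, and 
the second confirms the totals of 21 necklaces and 15 necklaces with at least 
one red bead. 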


A pattern inventory can be used to answer both of the questions posed 
at the start of this section. Specifically, consider the 6-bead necklace 
example with cycle index 


$$f(x_1, x_2, x_3, x_4, x_5, x_6) = \frac{1}{12}\left(2x_6 + 2x_3^2 + 4x_2^3 + 3x_1^2 x_2^2 + x_1^6\right).$$ 

Suppose that each bead can be B = blue, G = green, R = red, W = white, 
or Y = yellow. We showed in Example 9.5 that 1505 distinct necklaces 
can be constructed with six beads if five colors are available for each bead. 
Of these 1505 necklaces, the number that have two red beads, three white 
beads, one yellow bead, and no blue or green beads will be the coefficient 
of $R^2W^3Y$ in the pattern inventory 


$$f(B + G + R + W + Y,\ \ldots,\ B^6 + G^6 + R^6 + W^6 + Y^6).$$ 


Of course, it would not be easy to compute this pattern inventory by hand. 
However, with the help of a symbolic manipulator like Maple, this pattern 
inventory is very easy to compute. We show how Maple can be used to 
compute pattern inventories in Section 9.5. 


We close this section with a discussion of why the Polya Enumeration 
Theorem is true. Rather than giving a formal proof of the theorem, which 
would be complicated and not intuitive, we give an informal discussion of 
why it is true with two colors. This discussion can be generalized in an 
obvious way for more than two colors. 


Let S be a set of s vertices, let R be the set {B = blue, G = green} of 
colors, and let X be the set of colorings of S by R. Suppose a group G of 
rigid motions acts on S with cycle index $f(x_1, \ldots, x_w)$. If $\pi \in G$ acts on S 
with a single cycle of length s, then for an element in X to be fixed by $\pi$, 
each of the vertices in S must be assigned the same color. We will keep a 
record of this by writing $B^s + G^s$, which we interpret as representing the 
fact that a coloring fixed by $\pi$ must have either s blue vertices or s green 
vertices. For example, with $\pi_1 = (1234)$ in our 4-bead necklace example, 
we would write $B^4 + G^4$, which we interpret as representing the fact that a 
coloring fixed by $\pi_1$ must have either four blue beads or four green beads. 


Now, suppose $\pi \in G$ acts on S with two cycles of lengths $s_1$ and $s_2$. 
Then for an element in X to be fixed by $\pi$, all of the $s_1$ vertices represented 
in the first cycle of $\pi$ must be assigned the same color, and all of the $s_2$ 
vertices represented in the second cycle of $\pi$ must also be assigned the same 
color. We will keep a record of this by writing 


$$(B^{s_1} + G^{s_1})(B^{s_2} + G^{s_2}),$$ 


which we interpret as representing the fact that for a coloring to be fixed 
by $\pi$, the $s_1$ vertices represented in the first cycle of $\pi$ must be all blue 
or all green (hence the first factor), while the $s_2$ vertices represented in 
the second cycle of $\pi$ must also be all blue or all green (hence the second 
factor). Note that by expanding this expression we obtain the following. 


$$(B^{s_1} + G^{s_1})(B^{s_2} + G^{s_2}) = B^{s_1+s_2} + B^{s_1}G^{s_2} + B^{s_2}G^{s_1} + G^{s_1+s_2}$$ 


The terms on the right-hand side of this equation represent the fact that 
for a coloring to be fixed by $\pi$, all of the vertices must be blue (hence the 
first term), or the $s_1$ vertices represented in the first cycle of $\pi$ must be all 
blue and the $s_2$ vertices represented in the second cycle of $\pi$ must be all 
green (hence the second term), or the $s_1$ vertices represented in the first 
cycle of $\pi$ must be all green and the $s_2$ vertices represented in the second 
cycle of $\pi$ must be all blue (hence the third term), or all of the vertices 
must be green (hence the fourth term). For example, with $\pi_2 = (13)(24)$ 
in our 4-bead necklace example, we would write 


$$(B^2 + G^2)(B^2 + G^2) = B^4 + B^2G^2 + B^2G^2 + G^4.$$ 


The first and last terms on the right-hand side of this equation indicate 
that the colorings in our 4-bead necklace example with four blue beads and 
four green beads are fixed by $\pi_2$. The middle two terms indicate that there 
are also two colorings in our 4-bead necklace example with two blue beads 
and two green beads that are fixed by $\pi_2$. 


More generally, suppose $\pi \in G$ acts on S with j cycles of lengths 
$s_1, s_2, \ldots, s_j$. Then for an element in X to be fixed by $\pi$, the vertices 
represented in each of these cycles must all be assigned the same color. We 
keep a record of this by writing 


$$(B^{s_1} + G^{s_1})(B^{s_2} + G^{s_2}) \cdots (B^{s_j} + G^{s_j}),$$ 


whose factors we interpret as representing the fact that for a coloring to 
be fixed by $\pi$, the vertices represented in each of the cycles of $\pi$ must be 
all blue or all green. Recall that for use in the cycle index we attach to $\pi$ 
the monomial $f_\pi = x_{s_1} x_{s_2} \cdots x_{s_j}$. Note that the above expression can be 
viewed as $f_\pi(B + G, \ldots, B^t + G^t)$, where t is the length of the longest cycle 
in $\pi$. As we have demonstrated, each term in the expansion of the above 
expression represents a coloring fixed by $\pi$ with the distribution of colors 
given by the bases to the number of vertices specified by the exponents. 
Hence, if we combine the terms in this expansion that are similar, the 
coefficient of each resulting term will be the total number of colorings fixed 
by $\pi$ with the distribution of colors given by the bases to the number of 
vertices specified by the exponents. And if we sum over all $\pi \in G$ and 
combine similar terms, the coefficient of each resulting term will be the 
total number of colorings fixed by any $\pi \in G$ with the distribution of colors 
given by the bases to the number of vertices specified by the exponents. 


We now claim that if the monomial kB“ G* appears in the pattern 
inventory of X, then there will be k patterns in X in which B appears 71 
times and G appears iz times. To see this, let Y be the subset of X that 
contains all of the colorings in which B appears 7; times and G appears iz 
times. Since G clearly acts on Y, Burnside’s Theorem states that 


Number of patterns in Y = ue 5 |Fix(z) | (9.1) 
|C TEG 

with $G$ acting on $Y$. It is precisely this number of patterns that we wish
to determine. As we have just discussed, the coefficients in the expanded
form of $f_\pi(B + G, \ldots, B^t + G^t)$ show the number of colorings fixed by
$\pi$ (i.e., $|\mathrm{Fix}(\pi)|$) for any $\pi \in G$. Hence, if we add the
$B^{i_1}G^{i_2}$ terms in $f_\pi(B + G, \ldots, B^t + G^t)$ for all $\pi \in G$, the
coefficient of the result will be the sum in (9.1). If we then divide this
coefficient by $|G|$, this will show the number of patterns in $Y$. More
generally, we can find the number of patterns for any possible distribution
of the colors $B$ and $G$ by adding and simplifying
$f_\pi(B + G, \ldots, B^t + G^t)$ for all $\pi \in G$ and dividing by $|G|$. The
coefficients of the result will show the number of patterns in $X$ with the
distribution of colors given by the bases of the terms to the number of
vertices specified by the exponents of the terms. Finally, note that adding
$f_\pi(B + G, \ldots, B^t + G^t)$ for all $\pi \in G$ and dividing the result by
$|G|$ will yield exactly the cycle index $f(x_1, x_2, \ldots, x_w)$ of $G$ acting
on $S$ evaluated at $B + G, \ldots, B^w + G^w$. Since this is how we defined the
pattern inventory of $X$, the result is shown.
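

The claim can also be checked by brute force, independently of the cycle index. The following Python sketch is not part of the Maple sessions in this book, and every function name in it is an illustrative assumption. For a 6-bead necklace with two colors it counts patterns in two ways: by grouping colorings into orbits under the 12 rigid motions, and by averaging the number of fixed colorings over the group separately for each color distribution, as in (9.1).

```python
from collections import Counter
from itertools import product

def dihedral_group(n):
    """The 2n rigid motions of a regular n-gon; p[i] is the image of vertex i."""
    rotations = [tuple((i + r) % n for i in range(n)) for r in range(n)]
    reflections = [tuple((r - i) % n for i in range(n)) for r in range(n)]
    return rotations + reflections

def act(p, coloring):
    """Move the color on vertex i to vertex p[i]."""
    moved = [None] * len(coloring)
    for i, color in enumerate(coloring):
        moved[p[i]] = color
    return tuple(moved)

def distribution(coloring, colors):
    """How many times each color is used, in the order given by colors."""
    return tuple(coloring.count(c) for c in colors)

def patterns_by_orbits(n, colors):
    """Count orbits directly, keyed by color distribution."""
    group, seen, count = dihedral_group(n), set(), Counter()
    for coloring in product(colors, repeat=n):
        if coloring not in seen:
            seen.update(act(p, coloring) for p in group)
            count[distribution(coloring, colors)] += 1
    return dict(count)

def patterns_by_burnside(n, colors):
    """Sum |Fix(pi)| over the group within each subset Y, then divide by |G|."""
    group, fixed = dihedral_group(n), Counter()
    for coloring in product(colors, repeat=n):
        fixed[distribution(coloring, colors)] += sum(
            act(p, coloring) == coloring for p in group)
    return {key: total // len(group) for key, total in fixed.items()}

by_orbits = patterns_by_orbits(6, "BG")
by_burnside = patterns_by_burnside(6, "BG")
print(by_orbits == by_burnside)   # do the two counts agree?
print(sum(by_orbits.values()))    # total number of patterns
print(by_orbits[(4, 2)])          # patterns with 4 blue and 2 green beads
```

The keys of both dictionaries are pairs (number of blue beads, number of green beads), so each entry corresponds to the coefficient of one monomial in the pattern inventory.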


9.5 The Pattern Inventory with Maple


In this section, we show how Maple can be used to count patterns and 
construct pattern inventories. We consider the 6-bead necklace example 
with cycle index 


$$f(x_1, x_2, x_3, x_4, x_5, x_6) = \frac{1}{12}\left(2x_6 + 2x_3^2 + 4x_2^3 + 3x_1^2x_2^2 + x_1^6\right).$$


We begin by defining this cycle index. Note that we use brackets [ ] 
to obtain the appropriate subscripts. 
> f := (1/12)*(2*x[6] + 2*x[3]^2 + 4*x[2]^3 + 3*x[1]^2*x[2]^2
>     + x[1]^6);


        f := 1/6*x[6] + 1/6*x[3]^2 + 1/3*x[2]^3 + 1/4*x[1]^2*x[2]^2 + 1/12*x[1]^6


To convert this expression into a function that we can evaluate in the usual 
manner, we enter the following unapply command.¹
> f := unapply(f, x[1], x[2], x[3], x[4], x[5], x[6]);


        f := (x_1, x_2, x_3, x_4, x_5, x_6) ->
            1/6*x_6 + 1/6*x_3^2 + 1/3*x_2^3 + 1/4*x_1^2*x_2^2 + 1/12*x_1^6


Although the preceding command changes the variables in f from x[i] to
x_i, this has no effect on how f can be used. For example, we can find the
number of distinct 6-bead necklaces if two colors are available for the beads
by evaluating f(2,2,2,2,2,2) as follows.

> f(2, 2, 2, 2, 2, 2);


13 


Hence, as we saw in Example 9.5, there are 13 distinct 6-bead necklaces 
if 2 colors are available for the beads. Suppose these colors are B = blue 
and G = green. To see how many of these 13 distinct necklaces have 
prescribed numbers of blue and green beads, we compute the following 
pattern inventory. 

> simplify(f(B+G, B^2+G^2, B^3+G^3, B^4+G^4, B^5+G^5, B^6+G^6));


        B^6 + G^6 + 3 B^3 G^3 + 3 B^4 G^2 + 3 B^2 G^4 + B^5 G + B G^5


1 The output displayed for this command was produced by Maple V Release 5. Pre- 
vious releases of Maple yield output in which the variables are changed to y1, y2,..., y6. 
This has no effect on how f can be used. 
Thus, for example, because the term $3B^4G^2$ appears in this pattern
inventory, there are three distinct 6-bead necklaces with four blue beads
and two green beads.


Now, suppose that the color R = red is also available for the beads. We 
can find the number of distinct 6-bead necklaces if three colors are available 
for the beads by evaluating f(3,3,3,3,3,3) as follows. 

> f(3, 3, 3, 3, 3, 3); 


92 


Hence, there are 92 distinct 6-bead necklaces if 3 colors are available for the 
beads. To see how these colors are distributed in the patterns, we compute 
the following pattern inventory. 

> simplify(f(B+G+R, B^2+G^2+R^2, B^3+G^3+R^3, B^4+G^4+R^4,
>     B^5+G^5+R^5, B^6+G^6+R^6));

        11 B^2 G^2 R^2 + 6 G R^3 B^2 + 6 G^2 R B^3 + 3 G R B^4 + 6 B R^3 G^2
        + 3 B R G^4 + 6 B^2 R G^3 + 3 B G R^4 + 6 B G^3 R^2 + 6 B^3 G R^2
        + B^6 + G^6 + 3 B^3 G^3 + 3 B^4 G^2 + 3 B^2 G^4 + B^5 G + B G^5
        + 3 B^3 R^3 + 3 G^3 R^3 + 3 B^4 R^2 + 3 B^2 R^4 + 3 G^4 R^2 + 3 G^2 R^4
        + B^5 R + B R^5 + G^5 R + G R^5 + R^6


Thus, for example, because the term $6GR^3B^2$ appears in this pattern
inventory, then there are six distinct 6-bead necklaces with one green bead, 
three red beads, and two blue beads. Also, as we would expect, note that 
each term in the 2-color pattern inventory is present in the 3-color pattern 
inventory. 


Finally, suppose that the color W = white is also available for the 
beads. We can find the number of distinct 6-bead necklaces if four colors 
are available for the beads by evaluating f(4,4,4,4,4,4) as follows.

> f(4, 4, 4, 4, 4, 4);


430 


Hence, there are 430 distinct 6-bead necklaces if 4 colors are available for 
the beads. To see how these colors are distributed in the patterns, we 
compute the following pattern inventory. 

> simplify(f(B+G+R+W, B^2+G^2+R^2+W^2, B^3+G^3+R^3+W^3,
>     B^4+G^4+R^4+W^4, B^5+G^5+R^5+W^5, B^6+G^6+R^6+W^6));


        16 B^2 G^2 R W + 16 B^2 G R^2 W + 16 B^2 G R W^2 + 16 B G^2 R^2 W
        + 16 B G^2 R W^2 + 16 B G R^2 W^2
        + 10 B^3 G R W + 10 B G^3 R W + 10 B G R^3 W + 10 B G R W^3
        + 11 B^2 G^2 R^2 + 11 B^2 G^2 W^2 + 11 B^2 R^2 W^2 + 11 G^2 R^2 W^2
        + 6 B^3 G^2 R + 6 B^3 G^2 W + 6 B^3 G R^2 + 6 B^3 R^2 W
        + 6 B^3 G W^2 + 6 B^3 R W^2
        + 6 B^2 G^3 R + 6 B^2 G^3 W + 6 B G^3 R^2 + 6 G^3 R^2 W
        + 6 B G^3 W^2 + 6 G^3 R W^2
        + 6 B^2 G R^3 + 6 B^2 R^3 W + 6 B G^2 R^3 + 6 G^2 R^3 W
        + 6 B R^3 W^2 + 6 G R^3 W^2
        + 6 B^2 G W^3 + 6 B^2 R W^3 + 6 B G^2 W^3 + 6 G^2 R W^3
        + 6 B R^2 W^3 + 6 G R^2 W^3
        + 3 B^4 G R + 3 B^4 G W + 3 B^4 R W + 3 B G^4 R + 3 B G^4 W + 3 G^4 R W
        + 3 B G R^4 + 3 B R^4 W + 3 G R^4 W + 3 B G W^4 + 3 B R W^4 + 3 G R W^4
        + 3 B^3 G^3 + 3 B^3 R^3 + 3 B^3 W^3 + 3 G^3 R^3 + 3 G^3 W^3 + 3 R^3 W^3
        + 3 B^4 G^2 + 3 B^4 R^2 + 3 B^4 W^2 + 3 B^2 G^4 + 3 G^4 R^2 + 3 G^4 W^2
        + 3 B^2 R^4 + 3 G^2 R^4 + 3 R^4 W^2 + 3 B^2 W^4 + 3 G^2 W^4 + 3 R^2 W^4
        + B^5 G + B^5 R + B^5 W + B G^5 + G^5 R + G^5 W
        + B R^5 + G R^5 + R^5 W + B W^5 + G W^5 + R W^5
        + B^6 + G^6 + R^6 + W^6


Thus, for example, because the term $16GRB^2W^2$ appears in this pattern
inventory, there are 16 distinct 6-bead necklaces with 1 green bead, 1
red bead, 2 blue beads, and 2 white beads.
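

As a cross-check outside Maple, the substitution of color sums into the cycle index can be sketched in Python. This is a minimal sketch and not the book's method: polynomials are stored as dictionaries from exponent tuples to coefficients, and the names `multiply`, `power_sum`, `NECKLACE`, and `pattern_inventory` are illustrative assumptions.

```python
from collections import Counter

def multiply(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    result = Counter()
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            result[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return result

def power_sum(length, k):
    """The polynomial c_1^length + ... + c_k^length in k color variables."""
    return Counter({tuple(length if j == i else 0 for j in range(k)): 1
                    for i in range(k)})

# Cycle index of the 6-bead necklace, one entry per monomial:
# (coefficient, {cycle length: exponent}). The factor 1/12 is applied last.
NECKLACE = [(2, {6: 1}), (2, {3: 2}), (4, {2: 3}), (3, {1: 2, 2: 2}), (1, {1: 6})]

def pattern_inventory(k, terms=NECKLACE, group_order=12):
    """Substitute x_i = c_1^i + ... + c_k^i into the cycle index and expand."""
    total = Counter()
    for coefficient, cycles in terms:
        monomial = Counter({(0,) * k: 1})
        for length, exponent in cycles.items():
            for _ in range(exponent):
                monomial = multiply(monomial, power_sum(length, k))
        for exponents, c in monomial.items():
            total[exponents] += coefficient * c
    return {e: c // group_order for e, c in total.items()}

for k in (2, 3, 4):
    print(k, sum(pattern_inventory(k).values()))   # total patterns with k colors
print(pattern_inventory(3)[(2, 1, 3)])      # exponents ordered (B, G, R)
print(pattern_inventory(4)[(2, 1, 1, 2)])   # exponents ordered (B, G, R, W)
```

The last two lines look up the coefficients of $B^2GR^3$ and $B^2GRW^2$, the two terms singled out in the discussion above.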


9.6 Switching Functions 


In this section we show how the theory discussed in this chapter can be 
applied to the classification of switching functions. A switching function is 
a function $f : \mathbb{Z}_2^n \to \mathbb{Z}_2$. (More generally, a switching function is a process
that can start with any number of inputs but has only two possible outputs.
The preceding definition is sufficient for our purposes.) Switching theory
was born in the first part of the 20th century due to the increasingly high
volume of telephone calls being placed through local switchboards. The
way switching functions were subsequently used by telephone companies
led to their more recent use in the design of digital computers. It is in this
area that switching functions as we have defined them (mapping from $\mathbb{Z}_2^n$
to $\mathbb{Z}_2$) are most useful because of how computers store, send, and receive
information as binary strings. 


Although we will not describe any specific applications of switching 
functions, we will mention that in general it is desirable to keep a record 
of all possible switching functions. However, this poses a problem because, 
even for small values of n, switching functions are very numerous.
Specifically, for each positive integer $n$, since $|\mathbb{Z}_2^n| = 2^n$, there are
$2^{2^n}$ switching functions. Hence, even for a value of n as small as 5,
there are more than 4
billion switching functions. Because switching functions are so numerous, it 
would not be practical to keep a record of all of them. What is done instead 
is that an equivalence relation is defined on the set of switching functions.
This breaks the set of switching functions into equivalence classes, and then
a record can be kept of just one function from each equivalence class. In
order to define the equivalence relation that is used in general, fix a positive
integer $n$ and let $X$ be the set of colorings of $\mathbb{Z}_2^n$ by $\mathbb{Z}_2$. Then $X$ is the set
of switching functions for the fixed value of $n$. Note that the symmetric
group $S_n$ acts on $X$ by


$$\pi f(x_1, x_2, \ldots, x_n) = f(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)})$$


for $f \in X$ and $\pi \in S_n$ (see Written Exercise 11). Then for $f, g \in X$,
we define $f \sim g$ if there exists $\pi \in S_n$ such that $\pi f = g$ (see Written
Exercise 12).


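Example 9.7 Let $n = 2$ so that $\mathbb{Z}_2^n = \mathbb{Z}_2^2 = \{00, 01, 10, 11\}$. Define
$f : \mathbb{Z}_2^2 \to \mathbb{Z}_2$ by

$$\begin{aligned}
f(0,0) &= 1 \\
f(0,1) &= 0 \\
f(1,0) &= 1 \\
f(1,1) &= 0.
\end{aligned}$$

Let $\pi$ be the cycle $(12) \in S_2$. Then $\pi f(x_1, x_2) = f(x_2, x_1)$. Hence,

$$\begin{aligned}
\pi f(0,0) &= 1 \\
\pi f(0,1) &= 1 \\
\pi f(1,0) &= 0 \\
\pi f(1,1) &= 0.
\end{aligned}$$

Thus, if we define $g : \mathbb{Z}_2^2 \to \mathbb{Z}_2$ by

$$\begin{aligned}
g(0,0) &= 1 \\
g(0,1) &= 1 \\
g(1,0) &= 0 \\
g(1,1) &= 0,
\end{aligned}$$

then $f \sim g$. □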
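

The action and the equivalence relation can be sketched in Python for small n. This is a minimal brute-force sketch, not a method from the book: a switching function is stored as a dictionary from binary tuples to outputs, a permutation as a 1-based tuple, and all function names are illustrative assumptions. It recomputes $\pi f$ for Example 9.7 and then partitions all switching functions with $n = 2$ into equivalence classes.

```python
from itertools import permutations, product

def apply_permutation(pi, f):
    """Return pi*f, where (pi f)(x_1, ..., x_n) = f(x_pi(1), ..., x_pi(n)).

    pi is a tuple with pi[i - 1] equal to pi(i), so (2, 1) is the cycle (12).
    """
    n = len(pi)
    return {x: f[tuple(x[pi[i] - 1] for i in range(n))] for x in f}

def switching_functions(n):
    """Yield every function from binary n-tuples to {0, 1} as a dict."""
    inputs = list(product((0, 1), repeat=n))
    for outputs in product((0, 1), repeat=len(inputs)):
        yield dict(zip(inputs, outputs))

def equivalence_classes(n):
    """Partition the switching functions into orbits under S_n."""
    classes = set()
    for f in switching_functions(n):
        orbit = frozenset(
            tuple(sorted(apply_permutation(pi, f).items()))
            for pi in permutations(range(1, n + 1)))
        classes.add(orbit)
    return classes

# Example 9.7: the function f and the cycle (12).
f = {(0, 0): 1, (0, 1): 0, (1, 0): 1, (1, 1): 0}
g = apply_permutation((2, 1), f)
print(g)

print(sum(1 for _ in switching_functions(2)))   # number of functions for n = 2
print(len(equivalence_classes(2)))              # number of equivalence classes
```

Because the number of switching functions is $2^{2^n}$, this enumeration is only practical for very small n, which is one reason the cycle index is the preferred tool.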


Since switching functions are recorded in general by keeping a record 
of one function from each equivalence class, it is of obvious importance to 
know the number of equivalence classes for each value of n. This is precisely 
where the theory discussed in this chapter applies. To count the number of 
equivalence classes we can use a cycle index. And to count the number of
equivalence classes in which the functions produce prescribed numbers of 
zeros and ones we can use a pattern inventory. We illustrate these ideas in 
the following example. 


Example 9.8 In this example we consider the switching functions
$f : \mathbb{Z}_2^n \to \mathbb{Z}_2$ with $n = 3$. We begin by noting that the elements in the
symmetric group $S_3$ can be expressed as the following cycles.


    $\pi \in S_3$    Cycle Representation
    $\pi_1$          (1)(2)(3)
    $\pi_2$          (12)(3)
    $\pi_3$          (13)(2)
    $\pi_4$          (1)(23)
    $\pi_5$          (123)
    $\pi_6$          (132)


Next, we apply each of the permutations in $S_3$ to the elements in the
set $\mathbb{Z}_2^3 = \{000, 001, 010, 011, 100, 101, 110, 111\}$. For example, applying $\pi_2$
to 011 yields 101 since $\pi_2$ flips the first and second entries and fixes the
third. In the following table we list the results from applying each of the
permutations in $S_3$ to the elements in $\mathbb{Z}_2^3$. Also, in the first column of the
following table we attach numerical labels to the elements in $\mathbb{Z}_2^3$. We will
use these labels to express the actions of the elements in $S_3$ on $\mathbb{Z}_2^3$ as cycles.


    Label   $\mathbb{Z}_2^3$   $\pi_1$   $\pi_2$   $\pi_3$   $\pi_4$   $\pi_5$   $\pi_6$
    1       000                000       000       000       000       000       000
    2       001                001       001       100       010       100       010
    3       010                010       100       010       001       001       100
    4       011                011       101       110       011       101       110
    5       100                100       010       001       100       010       001
    6       101                101       011       101       110       110       011
    7       110                110       110       011       101       011       101
    8       111                111       111       111       111       111       111


Now, in the following table we list the actions of each of the permutations
in $S_3$ on the labels of the elements in $\mathbb{Z}_2^3$. For example, the action of $\pi_2$ is
(1)(2)(35)(46)(7)(8) since, using the labels in the preceding table, $\pi_2$ fixes
elements 1, 2, 7, and 8; sends elements 3 and 5 to each other; and sends
elements 4 and 6 to each other. Also, in the third column of the following
table we list the monomials for the resulting cycle index.


    $\pi \in S_3$    Action on $\mathbb{Z}_2^3$      Monomial
    $\pi_1$          (1)(2)(3)(4)(5)(6)(7)(8)        $x_1^8$
    $\pi_2$          (1)(2)(35)(46)(7)(8)            $x_1^4 x_2^2$
    $\pi_3$          (1)(25)(3)(47)(6)(8)            $x_1^4 x_2^2$
    $\pi_4$          (1)(23)(4)(5)(67)(8)            $x_1^4 x_2^2$
    $\pi_5$          (1)(253)(467)(8)                $x_1^2 x_3^2$
    $\pi_6$          (1)(235)(476)(8)                $x_1^2 x_3^2$


Thus, the cycle index for this example is 


$$f(x_1, x_2, x_3) = \frac{1}{6}\left(x_1^8 + 3x_1^4x_2^2 + 2x_1^2x_3^2\right).$$


Since there are two colors in this example (the numbers zero and one), then 
the total number of equivalence classes of switching functions with n = 3 
is given by $f(2,2,2) = \frac{1}{6}(256 + 192 + 32) = 80$. To see how many of these
equivalence classes contain functions that produce prescribed numbers of 
zeros and ones, we compute the following pattern inventory. Denote the 
colors by A = zero and B = one. Then the pattern inventory is 


$$\begin{aligned}
f(A + B, A^2 + B^2, A^3 + B^3)
&= \frac{1}{6}\left((A + B)^8 + 3(A + B)^4(A^2 + B^2)^2 + 2(A + B)^2(A^3 + B^3)^2\right) \\
&= A^8 + 4A^7B + 9A^6B^2 + 16A^5B^3 + 20A^4B^4 + 16A^3B^5 + 9A^2B^6 + 4AB^7 + B^8.
\end{aligned}$$


Hence, for example, because the term $16A^5B^3$ appears in this pattern
inventory, there are 16 equivalence classes that contain functions that
produce 5 zeros and 3 ones. □
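

The expansion in this example can be re-derived with a short Python sketch. It is only an illustration under stated assumptions: the three monomials of the cycle index are typed in by hand from the table above, the names `two_color_inventory` and `S3_ON_STRINGS` are hypothetical, and because there are only two colors each polynomial is stored as a list indexed by the exponent of B (the exponent of A is then 8 minus that index).

```python
def two_color_inventory(terms, group_order, degree):
    """Return a list c in which c[k] is the coefficient of A^(degree-k) B^k."""
    total = [0] * (degree + 1)
    for coefficient, cycles in terms:
        poly = [1]                       # the constant polynomial 1
        for length, exponent in cycles.items():
            for _ in range(exponent):
                # Multiply poly by (A^length + B^length).
                longer = [0] * (len(poly) + length)
                for k, c in enumerate(poly):
                    longer[k] += c            # take A^length
                    longer[k + length] += c   # take B^length
                poly = longer
        for k, c in enumerate(poly):
            total[k] += coefficient * c
    return [c // group_order for c in total]

# x_1^8 + 3 x_1^4 x_2^2 + 2 x_1^2 x_3^2, stored as
# (coefficient, {subscript: exponent}); the factor 1/6 is applied last.
S3_ON_STRINGS = [(1, {1: 8}), (3, {1: 4, 2: 2}), (2, {1: 2, 3: 2})]

inventory = two_color_inventory(S3_ON_STRINGS, group_order=6, degree=8)
print(inventory)        # coefficients of A^8, A^7 B, ..., B^8
print(sum(inventory))   # total number of equivalence classes
print(inventory[3])     # classes whose functions produce 5 zeros and 3 ones
```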


9.7 Switching Functions with Maple 


In this section we show how Maple can be used to count and classify equiv- 
alence classes of switching functions. We demonstrate using the results 
obtained in Example 9.8. 


To construct the cycle index for a set of switching functions, we have 
provided the user-written procedure switch, for which code is given in 
Appendix C.4. Because switch calls and uses the user-written procedure 
ppoly, for which code is also given in Appendix C.4, both of the procedures 
switch and ppoly must be saved as text files in the directory from which
we are running Maple. If they are saved as the text files switch and ppoly, 
then we can include the switch procedure in this Maple session by entering 
the following command. 


> read switch; 


We can then construct the cycle index for the set of switching functions 
with n = 3 by entering the following command.²


> f := switch(3, x, 'maxsub');

        f := 1/6*x[1]^8 + 1/2*x[1]^4*x[2]^2 + 1/3*x[1]^2*x[3]^2


Note that the result is the cycle index in Example 9.8. In the preceding 
command, the first parameter specifies the value of n. The second pa- 
rameter is the variable in whose terms the resulting cycle index is to be 
expressed. The third parameter is a variable defined by the command as 
the value of the largest subscript on the variables in the cycle index. (This 
would be important for larger values of n.) 


Next, we convert the preceding cycle index into a function that we can 
evaluate in the usual manner. Note that in this command we include input 
parameters x[1] through x[3] because 3 is the largest subscript on the 
variables in the cycle index.

> f := unapply(f, x[1], x[2], x[3]);


        f := (x_1, x_2, x_3) -> 1/6*x_1^8 + 1/2*x_1^4*x_2^2 + 1/3*x_1^2*x_3^2


Since there are two colors, we can find the total number of equivalence 
classes of switching functions by evaluating f(2,2,2) as follows. 
> f(2, 2, 2);


80 
To classify these equivalence classes, we compute the following pattern in- 
ventory. Denote the colors by A = zero and B = one. Then we can compute 
the pattern inventory by entering the following command. 
> simplify(f(A+B, A^2+B^2, A^3+B^3));

        4 B^7 A + 9 B^6 A^2 + 16 B^5 A^3 + 20 B^4 A^4 + 16 B^3 A^5 + 9 B^2 A^6
        + 4 B A^7 + A^8 + B^8


Note that the result is the pattern inventory in Example 9.8. 
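

For readers who want to see what such a routine has to compute, the following Python sketch builds the same kind of cycle index by brute force. It is an independent illustration and not the switch procedure from Appendix C.4, whose code is not reproduced here; the names `induced_cycle_type`, `switching_cycle_index`, and `count_classes` are assumptions. Each permutation in $S_n$ is applied to all binary strings of length n, the cycle lengths of the induced permutation are recorded, and equal monomials are collected.

```python
from collections import Counter
from itertools import permutations, product
from math import factorial

def induced_cycle_type(pi, n):
    """Cycle lengths of the permutation of binary n-strings induced by pi.

    pi is 0-based: the entry in position i moves to position pi[i].
    """
    def image(x):
        y = [0] * n
        for i, bit in enumerate(x):
            y[pi[i]] = bit
        return tuple(y)

    seen, cycle_type = set(), Counter()
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:        # walk once around the cycle through start
            seen.add(x)
            x = image(x)
            length += 1
        cycle_type[length] += 1
    return cycle_type

def switching_cycle_index(n):
    """Return (terms, maxsub). terms maps each monomial, stored as a sorted
    tuple of (subscript, exponent) pairs, to the number of permutations in
    S_n producing it; maxsub is the largest subscript that occurs."""
    terms = Counter()
    for pi in permutations(range(n)):
        terms[tuple(sorted(induced_cycle_type(pi, n).items()))] += 1
    maxsub = max(sub for monomial in terms for sub, _ in monomial)
    return terms, maxsub

def count_classes(n, colors=2):
    """Evaluate the cycle index with every variable set to the number of colors."""
    terms, _ = switching_cycle_index(n)
    total = sum(count * colors ** sum(exp for _, exp in monomial)
                for monomial, count in terms.items())
    return total // factorial(n)

terms, maxsub = switching_cycle_index(3)
print(dict(terms), maxsub)
print(count_classes(3))
```

For n = 3 the collected monomials correspond to $x_1^8$, $3x_1^4x_2^2$, and $2x_1^2x_3^2$ before division by $|S_3| = 6$. The sketch visits all $n!$ permutations and all $2^n$ strings, so it is intended only for small n.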


2 Due to the tremendous number of switching functions for even moderately sized
values of n, this routine, depending on your machine speed, can be very time-consuming 
for n > 10. 


Written Exercises


1. Suppose you wish to construct a necklace with three colored beads, 
and each bead can be either red or white. Assume that the beads can 
be rotated around the necklace, and that the necklace can be flipped 
over and worn. Consider the following general shape for the necklace 
with one bead positioned at each corner. 


1 


2 3 


Let G be the set of rigid motions of a triangle, let S be the set 
of vertices of the preceding general figure, and let X be the set of 
colorings of S by the colors R = red and W = white. 


(a) For each $\pi \in G$, find $\mathrm{Fix}(\pi)$.
(b) For each x € X, find Stab(x). 


(c) Find the cycle index of G acting on S. Use this cycle index to 
determine the number of distinct necklaces you can construct. 


(d) Find the pattern inventory of X. 


(e) Suppose each bead can also be B = blue. Determine the num- 
ber of distinct necklaces you can construct with this additional 
color available. Also, find the new pattern inventory. Accord- 
ing to this new pattern inventory, how many of the new distinct 
necklaces have at least one blue bead? 


2. How many distinct necklaces can you construct with three beads if 
ten colors are available for each bead? Assume that the beads can 
be rotated around the necklace, and that the necklace can be flipped 
over and worn. 


3. Suppose you wish to construct a necklace with five colored beads, 
and each bead can be either red or white. Assume that the beads can 
be rotated around the necklace, and that the necklace can be flipped 
over and worn. 


(a) How many distinct necklaces can you construct? 


(b) How many of the distinct necklaces in part (a) have two red 
beads and three white beads? 


(c) How many of the distinct necklaces in part (a) have at least three 
white beads? 
4. Suppose you wish to construct a building in the shape of a pentagon,
and you will paint each side of the building one of ten different colors. 
Assume two buildings are equivalent if one could be rotated or flipped 
to look like the other. How many nonequivalent buildings can you 
construct? 


5. Suppose you wish to construct a six-pointed star, and you will paint 
each point on the star either blue, green, or red. Assume two stars 
are equivalent if one could be rotated to look like the other. (Rotated 
only, not flipped!) 


(a) How many nonequivalent stars can you construct? 


(b) How many of the nonequivalent stars in part (a) have each of 
the three colors used on exactly two of the points? 


(c) How many of the nonequivalent stars in part (a) have each of 
the three colors used on at least one of the points? 


6. Repeat Written Exercise 5 if you assume two stars are equivalent if 
one could be rotated or flipped to look like the other. 


7. Find the number of equivalence classes of switching functions 
with n = 4. 


8. Prove Lemma 9.1. 
9. Prove Lemma 9.2. 
10. Prove Theorem 9.5. 


11. Let X be the set of switching functions for a fixed positive integer n. 
Show that the symmetric group $S_n$ acts on $X$ by


$$\pi f(x_1, x_2, \ldots, x_n) = f(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)})$$


for $f \in X$ and $\pi \in S_n$.
12. Let X be the set of switching functions for a fixed positive integer n.
For $f, g \in X$, define $f \sim g$ if there exists $\pi \in S_n$ such that $\pi f = g$
using the action of $S_n$ on $X$ defined in Written Exercise 11. Show
that ~ is an equivalence relation. 


Maple Exercises


1. Suppose you wish to construct a necklace with five colored beads, 
and each bead can be either red, white, blue, or yellow. Assume that 
the beads can be rotated around the necklace, and that the necklace 
can be flipped over and worn. Let G be the set of rigid motions of a 
pentagon, let S be the set of vertices of a pentagon, and let X be the 
set of colorings of S by the colors R = red, W = white, B = blue, 
and Y = yellow. Find the pattern inventory of X. Then use this 
pattern inventory to determine the number of distinct necklaces you 
can construct with exactly two red beads, one white bead, one blue 
bead, and one yellow bead. 


2. Find the number of equivalence classes of switching functions with 
n = 5. Also, determine how many of these equivalence classes contain 
functions that produce 30 zeros and 2 ones. 


Appendix A


Basic Maple Tutorial 


The purpose of this appendix is to introduce some basic commands, syntax, 
and programming concepts for Maple V Release 5. For a more thorough 
introduction to Maple, see [5] and [16]. 


A.1 Introduction to Maple 


Maple is an advanced software tool designed for doing complicated math- 
ematics quickly and precisely on a computer. To use Maple, you enter 
commands at “prompts” that can be identified as the following symbol 
that appears on the left in the body of a Maple worksheet. 


> 


When you access Maple, a Maple window will open that contains a prompt 
at which you can immediately begin performing mathematical operations. 
For example, you can use Maple to multiply the numbers 247 and 3756 by 
entering “247 * 3756;” as follows. 

> 247 * 3756; 


927732 


When you enter a Maple command (assuming no syntax errors), Maple 
will perform the calculation and move the cursor to the next command line 
in the worksheet (which it will create if no subsequent command line exists). 
Each Maple command must end with either a semicolon or a colon. If you 
end a Maple command with a semicolon, Maple will display the result. If 
you use a colon, Maple will suppress the result. For example, if you enter 
the preceding command with a colon instead of a semicolon, Maple will
respond as follows. 


> 247 * 3756: 


Despite the fact that no result is displayed after this command, the cal- 
culation was performed. Suppression of output is useful in many of the 
applications we discuss in this book. 


If you do not include a semicolon or colon at the end of a Maple 
command, Maple will not perform the calculation and will respond with 
a warning message after a new prompt. For example, if you enter the 
preceding command without a semicolon or colon, Maple will respond with 
a warning message similar to the following. 


> 247 * 3756 


> 
Warning, incomplete statement or missing semicolon 


One of the great benefits of a computer package like Maple is that if you 
make a mistake in entering a command, you can go back to the command 
and correct the error. For example, you can remedy the preceding warning 
message by returning to the preceding command and entering it again with 
a semicolon or colon. If you enter the preceding command with a semicolon 
at the end of the command line, Maple will respond as follows. 


> 247 * 3756; 


> 


927732 


A.2 Arithmetic 


Maple is an example of a computer algebra system. One feature of such a 
system is that it can be used as a very smart calculator. In particular, you 
can very easily use Maple to add, subtract, multiply, or divide numbers 
or algebraic expressions. These arithmetic operations can be performed in 
Maple by using the following symbols: + for addition, - for subtraction, * 
for multiplication, and / for division. Also, the operation of exponentiation 
can be performed in Maple by using the symbols ^ or **. As examples, two 
numbers or fractions can be added in Maple as follows. 


> 253 + 7775; 
8028 
> 25/27 + 3/51;

        452/459

Operations can be performed in Maple on the last entered result by using
the percent symbol %.¹ For example, by entering the following command,
we multiply the preceding result by 23.


> 23 * %;

        10396/459

Two percent symbols listed together refer to the next-to-last entered result.

> 23 * %%;

        10396/459


And we can evaluate $3^7$ by entering either of the following commands.

> 3^7;

        2187

> 3**7;


        2187


Like other computer algebra systems, Maple uses exact arithmetic. 
Thus, if we divide two integers, Maple will return the exact answer as 
follows. 


> 3235/7478; 


        3235/7478


The Maple evalf command can be used to obtain the decimal representa- 
tion of a number. For example, the following command returns the decimal 
representation of the preceding number. The default number of digits dis- 
played is ten. 


> evalf(%); 
.4326023001 


1 Maple V Release 5 is the first release of Maple that uses the percent symbol % to
refer to the last entered result. Earlier releases of Maple use ditto marks " for this 
purpose. 
To obtain more or fewer than ten digits, the desired number of digits must
be specified. For example, the following command yields the first 20 digits 
in the decimal representation of the preceding fraction. 

> evalf(%%, 20);


.43260230008023535705 


As a final note in this section, we mention the fact that Maple will 
recognize only parentheses to enclose groups of objects with the basic arith- 
metic operations. Maple will not recognize square brackets or curly braces 
for this purpose. We illustrate this in the next two commands. 


> 5^[15*(3+2)];
Error, non algebraic terms in power should be of the same type


> 5^(15*(3+2));
26469779601696885595885078146238811314105987548828125 


A.3 Defining Variables and Functions 


To assign a numerical value or expression to a variable in Maple, you must 
use the colon-equal := notation. For example, the following command as- 
signs the value 5 to the variable y. 

> y := 5; 

y:=5 

The variable y will then have this value throughout the current Maple ses- 
sion until y is assigned another value or its value is “unassigned”. To display 
the contents of this variable, we must only enter the following command. 

> y; 

5 

We can directly perform mathematical operations using the assigned
variable y as illustrated in the next command.
> 4*y + 5; 


25 


The following command can be used to “unassign” the value of y. Note
that we use single quotes (apostrophes) ' in this command.


> y := 'y';


There are two ways to define functions in Maple. The most useful way
is to use the minus-greater than -> notation. For example, the following
command defines the function $f(x) = x^2$.

> f := x -> x^2;

        f := x -> x^2

The reason this is the most useful way to define a function in Maple is 
because it allows standard functional notation like f(5) to be used when 
evaluating the function at a particular value. 

> f(5); 


25 


Functions can also be defined in Maple as expressions without using 
the -> notation. For example, the following command defines $f(x) = x^2$ as
an expression.

> f := x^2;


        f := x^2


The reason this method for defining a function in Maple is not as useful is 
because evaluating the function at a particular value then requires use of 
the Maple subs command. For example, to evaluate f(5) we must enter 
the following command. 

> subs(x=5, f); 


25 


For a function defined as an expression, standard functional notation is not 
understood by Maple and results in nonsense. 
> f(x); 


> f(5);


A.4 Algebra 


Another benefit of Maple is that it allows entire algebraic expressions to 
be manipulated in the same way calculators manipulate numbers. Some 
important Maple commands for performing operations with algebraic ex- 
pressions are simplify to simplify an expression, expand to expand an 
expression, factor to factor an expression, and solve to solve an equation
or system of equations. Examples of these commands follow.

> simplify( (x^3+1)/(x^2-x+1) );

        x + 1

> expand( (x^2+1)*(x+1)*(x+3) );

        x^4 + 4 x^3 + 4 x^2 + 4 x + 3

> factor(%);

        (x^2 + 1) (x + 1) (x + 3)

> sol := solve( x^3-9*x^2+20*x=0, x );

        sol := 0, 4, 5


The output for the preceding solve command is returned by Maple as a
sequence whose elements can be chosen. For example, we can choose the second
element in sol by entering the following command. 

> sol[2]; 


        4


A.5 Case Sensitivity 


Maple is case sensitive — it distinguishes between upper and lower case 
characters in commands. For example, to factor the polynomial $x^2 - 2x - 3$,
we can enter the following command.


> factor(x^2-2*x-3);

        (x + 1) (x - 3)


However, the next command does not yield the result. 
> FACTOR(x^2-2*x-3);


        FACTOR(x^2 - 2 x - 3)


Maple has several functions designed for doing modular arithmetic. For 
many of these functions, the name of the function is the same as for the 
nonmodular arithmetic function but with an upper case first letter. For 
example, to factor the polynomial $x^2 - 2x - 3$ over the integers modulo 3,
we can enter the following command.

> Factor(x^2-2*x-3) mod 3;

        x (x + 1)


As another example, consider the Maple irreduc function, which returns
true if a polynomial input is irreducible over the integers and false
if not. The following command indicates, as expected, that $x^2 + 1$ is
irreducible over the integers.


> irreduc(x^2+1);


        true


However, the next command states that $x^2 + 1$ is not irreducible over the
integers modulo 2.


> Irreduc(x^2+1) mod 2;


        false


We can see how $x^2 + 1$ factors over the integers modulo 2 by entering the
following Factor command.

> Factor(x^2+1) mod 2;


        (x + 1)^2


A.6 Help File 


If you ever need to see information or an example regarding a particular 
Maple command, you can gain access to a help window for the command by 
entering the command name preceded by a question mark (and not followed 
by a semicolon). For example, the following command causes Maple to 
display a help window for the factor command. 


> ? factor 


A.7 Arrays and Loops 


Arrays in Maple are data structures in which the elements are grouped
sequentially. To create an array in Maple, we can use the array function.
For example, the following command creates an array with four elements.

> a := array([5, 1, -4, 6]);


        a := [5, 1, -4, 6]

Maple associates with each element in an array an integer that can be used
to access the element. To access an element in an array, we enter the name 
of the array with the position of the element we wish to access in square 
brackets. For example, the following command returns the third element 
in the preceding array a. 


> a[3];

        -4


In this book we often create arrays that we use as vectors. In Maple, 
the vector routine, which is part of the linalg linear algebra package (a 
more detailed discussion of this package is given in Appendix B), can be 
used to create vectors. Vectors function in Maple essentially the same way 
as arrays, except that the integers Maple associates with the elements in a 
vector always start with an integer index of 1, whereas arrays can have any 
integer index. To illustrate the vector command, we first include the Maple 
linalg package in this Maple session by entering the following command. 

> with(linalg): 


Then in the next command we create a vector with four elements. 
> b := vector([1, 1, 3, -4]); 


        b := [1, 1, 3, -4]


The following command allows us to access the first element in b. 
> b[1]; 


1 


In the next command we create an empty array with storage for four ele- 
ments. 


> c := array(1..4); 
c := array(1..4,[ ]) 


In Maple, loops are designed to repeat a specific command a specified 
number of times. The most basic type of loop in Maple is a for loop. In 
the following commands we enter a for loop in which we sequentially access 
the elements in a and b, multiply each of the corresponding elements, and 
store the results in c. 

> for i from 1 to 4 do 

> c[i] := a[i]*b[i];


> od: 


In this loop, the indexing element i starts at 1 and increases by 1 with 
each passage through the loop. The loop terminates when i reaches the 
upper index 4. Every for loop in Maple ends with a matching od statement
(the reverse of the letters in the do statement). We use a colon after the 
od statement to prevent Maple from printing the intermediate calculations 
during the progression of the loop. 


The Maple evalm command can be used to display the contents of a 
vector. For example, to display the vector c constructed in the preceding 
for loop, we enter the following command. 


> evalm(c); 
        [5, 1, -12, -24]


Another useful type of loop in Maple is a while loop, which executes 
the commands inside the loop until a specified condition fails. For example, 
the following while loop constructs the same array c as the preceding for 
loop. 


> i := 0:

> while i < 4 do

> i := i + 1;

> c[i] := a[i]*b[i];
> od:


This while loop executes the two commands inside the loop, each time 
incrementing i by 1, and terminates when i reaches 4. While loops also 
end with od statements. The next command shows that this while loop 
constructs the same array c as the preceding for loop. 


> evalm(c); 


        [5, 1, -12, -24]


A.8 Conditional Statements 


Conditional statements in Maple are designed to decide which provided 
commands to execute based on whether a provided statement is true or 
false. To demonstrate, we define the following numbers x and y. 


> x := 13034021/29391911;

        x := 13034021/29391911

> y := 2483118283/4630112000;

        y := 2483118283/4630112000


The Maple if statement can be used to perform conditional statements.
When combined with the else statement, it performs a double alternative 
step. For example, suppose we wish to determine which of x and y is larger. 
To do this, we can enter the following commands. 


> if x > y then

> print(x);

> else

> print(y);

> fi:

        2483118283/4630112000


The if statement in the preceding commands checks if x is greater than 
y. Since this is false, Maple executes the print command after the else 
statement. Maple if statements end with fi statements (the reverse of the 
letters in the if statement). 


To produce multiple decision statements in Maple, we can use the if 
statement in combination with the elif and else statements. For example, 
the following Maple for loop compares the corresponding elements in the 
array a and vector b defined in Appendix A.7, and multiplies or divides the 
elements depending on which one is larger. Maple then stores the results 
in the array c. If the elements in a and b are equal, then the corresponding 
element in c is defined to be 0. 


> for i from 1 to 4 do 
>    if a[i] > b[i] then 
>       c[i] := a[i]*b[i]: 
>    elif a[i] < b[i] then 
>       c[i] := a[i]/b[i]: 
>    else 
>       c[i] := 0: 
>    fi: 
> od: 


We can see the resulting array c by entering the following command. 


> evalm(c); 


A.9 Maple Procedures 


A Maple procedure is a prearranged collection of commands that Maple 
executes together. To define a procedure in Maple we use the proc state- 
ment. In order to illustrate the syntax for writing a Maple procedure we 
construct the following procedure dprod, which is designed to compute the 
dot product of two vectors. 


> dprod := proc(v1, v2) 
>    local res, n, i; 
>    res := 0: 
>    n := linalg[vectdim](v1); 
>    for i from 1 to n do 
>       res := res + v1[i]*v2[i]: 
>    od: 
>    RETURN(res): 
> end: 


In this procedure, v1 and v2 are input vectors. Note that the proc state- 
ment is not ended with a semicolon. The local statement that appears at 
the start of this procedure defines variables whose values are used only in 
the procedure itself. After the procedure terminates, these variables will 
return to their assigned values, if any, from before the procedure was exe- 
cuted. The RETURN statement that appears at the end of this procedure 
specifies the value to be returned by the procedure to the calling program. 
If this statement is not included in the procedure, then the procedure will 
return its last computed result. To specify the end of the procedure, we use 
an end statement. And we use a colon after the end statement to prevent 
Maple from printing the commands and statements in the procedure after 
the procedure is entered or read in as text. 


In the following commands, we define two vectors and demonstrate 
how the dprod procedure can be used to compute their dot product. 


> vect1 := vector([1, 3, 5, 3, 6]); 
vect1 := [1, 3, 5, 3, 6] 
> vect2 := vector([7, 2, 1, 0, -1]); 
vect2 := [7, 2, 1, 0, -1] 


> dprod(vect1, vect2); 


12 
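
For readers who want to cross-check the result outside Maple, the same loop can be sketched in Python. This is only an illustrative analogue of dprod, not part of the Maple session; the length check is an addition that the Maple procedure above does not perform.

```python
def dprod(v1, v2):
    """Dot product of two equal-length sequences, mirroring the Maple loop."""
    if len(v1) != len(v2):
        # Assumption: reject mismatched lengths instead of indexing past the end.
        raise ValueError("vectors must have the same length")
    res = 0
    for a, b in zip(v1, v2):
        res += a * b
    return res


vect1 = [1, 3, 5, 3, 6]
vect2 = [7, 2, 1, 0, -1]
print(dprod(vect1, vect2))  # 12, matching the Maple output
```

The accumulator res plays the same role as the local variable res in the Maple procedure.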


We mention one final note regarding how user-written procedures can 
be included in Maple sessions. First, the statements in any procedure can 
always be entered interactively, line by line in a Maple session. However, 
if the procedure is very long (for example, see the procedures in Appendix 
C.4), this may not be practical. Another way to include a user-written 
procedure in a Maple session is by saving the text of the procedure as a 
text file, and reading the text file into a Maple session using a Maple read 
command. To do this in a UNIX-like environment, the procedure must 
be saved as a text file in the same directory in which the Maple program 
is running. For other operating systems, the proper location of the text 
file varies. Assuming we are working in a UNIX-like environment, if we 
have saved the text of the dprod procedure shown above as the text file 
dprod, then we can include the procedure in a Maple session by entering 
the following command. 

> read(dprod); 


The procedure can then be used as illustrated above. 


Appendix B 


Some Maple Linear 
Algebra Commands 


Most of the Maple functions that deal with vector and matrix computa- 
tions are part of the linalg linear algebra package. In this appendix, we 
give a brief introduction to some of these functions. We begin by entering 
the following command, which includes the linalg package in this Maple 
session. 


> with(linalg): 


By entering the preceding command, we gain access to all of the routines in 
the linalg package. Note that we used a colon at the end of this command. 
This suppresses the list of available linear algebra routines that would have 
been displayed had we used a semicolon. If you wish to see a list of the 
available routines, just enter the preceding command with a semicolon (or 
look at the help file on the linalg package; see Appendix A.6). 


There are several ways to enter a matrix in Maple. In the following 
command, we use the Maple matrix function to define a 3 x 3 matrix A. 
The first two parameters in this command are the dimensions of the result. 
The remaining parameters enclosed in brackets are the entries in the matrix 
listed by consecutive rows. 


> A := matrix(3, 3, [2, 5, 7, 3, 1, 7, 8, 1, 2]); 


Next, we again use the matrix function to define the matrix A, but 
this time we use slightly different syntax. In this command, we use double 
brackets within the matrix function. The outside brackets again contain 
the entries in the matrix listed by consecutive rows, but now each specific 
row is set off by another set of square brackets. This syntax does not require 
that we specify the dimensions of the result. 


> A := matrix([[2, 5, 7], [3, 1, 7], [8, 1, 2]]); 

                              [2  5  7] 
                         A := [3  1  7] 
                              [8  1  2] 


In Maple, when a variable has been assigned a number or an expression, 
it is usually possible to see this stored value or expression by entering 
the name of the variable. For example, note the following commands. 


However, if a variable has been assigned as a matrix, then entering the 
name of the variable will not cause Maple to display the matrix. Instead, 
Maple will print only the name of the matrix. For example, if we try to 
view the matrix constructed above by entering the name of the variable A 
in which the matrix is stored, Maple will respond as follows. 

> A; 


A 


To view the matrix A constructed above, we must use the Maple evalm 
function as follows. 

> evalm(A); 

                              [2  5  7] 
                              [3  1  7] 
                              [8  1  2] 


In the next command, we multiply each of the entries in A by the 
scalar 4. Again, Maple responds with only the name of the result. 


> 4*A; 


4A 


But we can again use the Maple evalm function to see the result. 


> evalm(4*A); 


8 20 28 
12 4 28 
32 4 8 


Next, we enter the following 3 x 3 matrix B. 

> B := matrix([[1, 0, 3], [-3, 5, 1], [2, 4, 1]]); 

                              [ 1  0  3] 
                         B := [-3  5  1] 
                              [ 2  4  1] 


To add matrices in Maple, we can use the usual + symbol. 


> evalm(A + B); 


3 5 10 
0 6 8 
10 5 3 


And to raise matrices to powers in Maple, we can use the usual ^ symbol. 


> evalm(A^2); 


75 22 63 
65 23 42 
35 43 67 


However, Maple distinguishes between scalar and matrix multiplication 
with different symbols. To multiply matrices in Maple, we must use the &* 
operator rather than the usual * symbol. 


> evalm(A &* B); 


1 53 18 
14 33 17 
9 13 27 


The preceding operations of matrix addition, exponentiation, and multiplication 
can be combined when evaluating expressions in Maple. For example, 
for the matrices A and B constructed above, we can evaluate $A^3 - 4BA$ by 
entering the following command. 


> evalm(A^3 - 4*B &* A); 


616 428 753 
467 426 636 
639 225 504 
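
The matrix results above can be cross-checked outside Maple with a short Python sketch. The helper names matmul, matadd, and scale are hypothetical and chosen only for this illustration; plain nested lists stand in for Maple matrices.

```python
def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(s, X):
    """Multiply every entry of X by the scalar s."""
    return [[s * x for x in row] for row in X]


A = [[2, 5, 7], [3, 1, 7], [8, 1, 2]]
B = [[1, 0, 3], [-3, 5, 1], [2, 4, 1]]

A2 = matmul(A, A)                                       # A^2
AB = matmul(A, B)                                       # A &* B
expr = matadd(matmul(A2, A), scale(-4, matmul(B, A)))   # A^3 - 4BA

for name, M in (("A^2", A2), ("A &* B", AB), ("A^3 - 4BA", expr)):
    print(name, M)
```

Note that the order of the factors matters in the last expression: the sketch multiplies B by A, in that order, as in the Maple command.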


To enter vectors in Maple, we can use the vector function. For example, 
in the following command, we define a vector c of length 3. 


> c := vector([1,4,2]); 


                         c := [1, 4, 2] 


For the matrix A and vector c defined above, we can use the Maple linsolve 
command as follows to solve the equation Ax = c. 


> x := linsolve(A, c); 

                    x := [39/205, -144/205, 121/205] 


Also, the following command yields the inverse of A. Note that to obtain 
the inverse of A, we need only raise A to the power -1. 


> invA := evalm(A^(-1)); 

                               [-1/41    -3/205    28/205] 
                       invA := [10/41   -52/205     7/205] 
                               [-1/41    38/205   -13/205] 


Since A is invertible, we can also solve the equation Ax = c by forming 
$x = A^{-1}c$ as follows. 

> x := evalm(invA &* c); 

                    x := [39/205, -144/205, 121/205] 
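
The solution of Ax = c can be cross-checked with exact rational arithmetic outside Maple. The following Python sketch uses Gauss-Jordan elimination over fractions; the function name solve is hypothetical, and the code assumes that A is square and invertible, as it is here.

```python
from fractions import Fraction


def solve(A, c):
    """Solve A x = c exactly by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(A)
    # Augmented matrix [A | c] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, c)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]


A = [[2, 5, 7], [3, 1, 7], [8, 1, 2]]
c = [1, 4, 2]
x = solve(A, c)
print([str(v) for v in x])  # ['39/205', '-144/205', '121/205']
```

Substituting the result back into Ax reproduces c, which is an easy independent check on both the Maple and the Python output.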


We close this appendix by mentioning how some special types of matrices 
can easily be defined in Maple. For example, in the following command, 
we define the 3 x 3 zero matrix. 

> mat0 := matrix(3, 3, 0); 


And in the following command, we define a 4 x 4 matrix containing all ones. 


> mat1 := matrix(4, 4, 1); 

                                [1  1  1  1] 
                        mat1 := [1  1  1  1] 
                                [1  1  1  1] 
                                [1  1  1  1] 


Finally, in the following command, we use the Maple diag function to define 
the 5 x 5 identity matrix. 

> id := diag(1, 1, 1, 1, 1); 

                              [1  0  0  0  0] 
                              [0  1  0  0  0] 
                        id := [0  0  1  0  0] 
                              [0  0  0  1  0] 
                              [0  0  0  0  1] 


In general, the diag function yields a diagonal matrix with diagonal entries 
given in order by the parameters in the command. 


© 1999 by CRC Press LLC 


Appendix C 


User-Written Maple 
Procedures 


C.1 Chapter 5 Procedures 


rscoeff := proc(f, x, p, a) 
   local g, i, j, ng, cg, fs, field, ftable; 
   fs := 2^(degree(p)); 
   field := linalg[vector](fs); 
   # field[i] holds a^i reduced modulo p(a) 
   for i from 1 to fs-1 do 
      field[i] := Powmod(a, i, p, a) mod 2: 
   od: 
   field[fs] := 0; 
   # ftable maps each reduced field element back to its power of a 
   ftable := table(); 
   for i from 1 to fs-1 do 
      ftable[ field[i] ] := a^i: 
   od: 
   ftable[ field[fs] ] := 0; 
   g := expand(f) mod 2; 
   ng := 0; 
   for j from 0 to degree(g,x) do 
      cg := coeff(g, x, j): 
      cg := ftable[ Rem(numer(cg), p, a) mod 2 ] 
            / ftable[ Rem(denom(cg), p, a) mod 2 ]; 
      if degree(cg,a) < 0 then 
         cg := cg * a^(fs-1); 
      fi: 
      if degree(cg,a) = (fs-1) then 
         cg := cg/a^(fs-1); 
      fi: 
      ng := ng + cg*x^j: 
   od: 
   g := sort(ng mod 2, x); 
   RETURN(g); 
end: 


binmess := proc(cw, n, p, a, ml) 
   local i, j, bvect, vs, pco, dga, binmat, binvect; 
   for i from 0 to ml do 
      pco := coeff(cw, x, i): 
      if pco <> 0 then 
         dga := degree(pco, a): 
         pco := Powmod(a, dga, p, a) mod 2: 
      fi: 
      # vs collects the n binary coefficients of pco 
      vs := []: 
      for j from 0 to n-1 do 
         vs := [op(vs), coeff(pco, a, j)]: 
      od: 
      if i = 0 then 
         binmat := linalg[matrix](1, n, vs): 
      else 
         binmat := linalg[stackmatrix](binmat, vs): 
      fi: 
   od: 
   binvect := convert(binmat, vector); 
   RETURN(evalm(binvect)); 
end: 


bincoeff := proc(n, bmess) 
   local i, j, k, bk, pcoeff, poly; 
   pcoeff := []: 
   bk := linalg[vectdim](bmess); 
   i := 0; 
   k := 0; 
   # each block of n bits becomes one polynomial in a 
   while i < bk do 
      poly := 0: 
      for j from 1 to n do 
         poly := poly + bmess[i+j]*a^(j-1): 
      od: 
      pcoeff := [op(pcoeff), poly]: 
      k := k+1; 
      i := k*n; 
   od: 
   RETURN(evalm(pcoeff)): 
end: 


rseuclid := proc(t, f, g, z, p, a) 
   local q, r, rm1, rp1, um1, u, up1, vm1, v, vp1, i; 
   rm1 := sort(Expand(f) mod 2); 
   r := sort(Expand(g) mod 2); 
   um1 := 1; 
   u := 0; 
   vm1 := 0; 
   v := 1; 
   read(rscoeff); 
   while degree(r,z) >= t do 
      rp1 := Rem(rm1, r, z, 'q') mod 2; 
      rp1 := rscoeff(rp1, z, p, a); 
      q := rscoeff(q, z, p, a); 
      vp1 := expand(vm1 - v*q) mod 2; 
      vm1 := v; 
      v := sort(vp1, z); 
      v := rscoeff(v, z, p, a); 
      up1 := expand(um1 - u*q) mod 2; 
      um1 := u; 
      u := sort(up1); 
      u := rscoeff(u, z, p, a); 
      rm1 := r; 
      r := sort(rp1, z); 
      print(`Q = `, q, ` R = `, r, 
            ` V = `, v, ` U = `, u); 
   od; 
   print(); 
   RETURN(q, r, v, u): 
end: 


C.2 Chapter 7 Procedures 


Note: The following two procedures are variations of procedures found in 
the examples folder of the Maple V Release 3 student version (see [27]) 
produced by Waterloo Maple Inc. and the University of Waterloo. 


to_number := proc(mess) 
   local sl, cn, sn, ii, ntable; 
   ntable := table(['a'=0, 'b'=1, 'c'=2, 'd'=3, 'e'=4, 
                    'f'=5, 'g'=6, 'h'=7, 'i'=8, 'j'=9, 'k'=10, 
                    'l'=11, 'm'=12, 'n'=13, 'o'=14, 'p'=15, 
                    'q'=16, 'r'=17, 's'=18, 't'=19, 'u'=20, 
                    'v'=21, 'w'=22, 'x'=23, 'y'=24, 'z'=25]): 
   sl := length(mess); 
   cn := 0; 
   # append two decimal digits per letter 
   for ii from 1 to sl do 
      sn := ntable[substring(mess, ii..ii)]: 
      cn := 100*cn + sn: 
   od: 
   RETURN(cn): 
end: 


to_letter := proc(num) 
   local cs, cn, sl, a, b, c, d, e, f, g, h, i, j, k, 
         l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, 
         ltable, ans; 
   ltable := table([0=a, 1=b, 2=c, 3=d, 4=e, 5=f, 6=g, 
                    7=h, 8=i, 9=j, 10=k, 11=l, 12=m, 
                    13=n, 14=o, 15=p, 16=q, 17=r, 18=s, 
                    19=t, 20=u, 21=v, 22=w, 23=x, 24=y, 
                    25=z]); 
   cn := num; 
   sl := floor(trunc(evalf(log10(cn)))/2) + 1: 
   ans := ``; 
   # peel off two decimal digits at a time, last letter first 
   for i from 1 to sl do 
      cn := cn/100; 
      cs := ltable[frac(cn)*100]; 
      ans := cat(cs, ans); 
      cn := trunc(cn); 
   od: 
   RETURN(ans); 
end: 
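
The two-digits-per-letter encoding used by to_number and to_letter can be illustrated outside Maple as well. The following Python sketch is a loose analogue under stated assumptions, not a translation of the procedures above: it accepts only lowercase letters a through z, and it treats the input 0 as the single letter a, a case the Maple version does not handle because of the log10 call.

```python
import string

LETTERS = string.ascii_lowercase  # 'a' -> 0, ..., 'z' -> 25


def to_number(mess):
    """Encode a lowercase message as an integer, two decimal digits per letter."""
    cn = 0
    for ch in mess:
        cn = 100 * cn + LETTERS.index(ch)  # ValueError for other characters
    return cn


def to_letter(num):
    """Decode a nonnegative integer produced by to_number back into letters."""
    letters = []
    cn = num
    while True:
        cn, code = divmod(cn, 100)     # peel off the last two digits
        letters.append(LETTERS[code])  # IndexError if code > 25
        if cn == 0:
            break
    return "".join(reversed(letters))


n = to_number("hello")
print(n)             # 704111114
print(to_letter(n))  # hello
```

As in the Maple procedures, a message that begins with the letter a does not survive a round trip, because its leading 00 digits vanish from the integer.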


C.3 Chapter 8 Procedures 


epoints := proc(ec, x, ub, p) 
   local ecurve, z, pct, k, i; 
   pct := 0; 
   for k from 0 to p-1 while pct <= ub do 
      z := subs(x=k, ec) mod p; 
      if z = 0 then 
         pct := pct + 1; 
         ecurve[pct] := [k, z]; 
      fi: 
      # if z is a nonzero square mod p, take its square roots 
      if z &^ ((p-1)/2) mod p = 1 then 
         z := z &^ ((p+1)/4) mod p; 
         ecurve[pct+1] := [k, z]; 
         ecurve[pct+2] := [k, -z mod p]; 
         pct := pct + 2; 
      fi: 
   od: 
   if pct > ub then 
      pct := ub: 
   fi: 
   seq(ecurve[i], i = 1..pct): 
end: 


addec := proc(le, re, c, p) 
   local i, cle, cre, lambda, res, x3, y3; 
   cle := le mod p; 
   cre := re mod p; 
   if cle = 0 or cre = 0 then 
      res := cle + cre; 
   elif cle[1] = cre[1] and cle[2] = -cre[2] mod p then 
      res := 0; 
   else 
      if cle[1] = cre[1] mod p and cle[2] = cre[2] mod p 
      then 
         lambda := ((3*cle[1]^2+c)/2/cle[2]) mod p; 
      else 
         lambda := (cre[2]-cle[2])/(cre[1]-cle[1]) mod p; 
      fi: 
      x3 := (lambda^2-cle[1]-cre[1]) mod p; 
      y3 := (lambda*(cle[1]-x3)-cle[2]) mod p; 
      res := [x3, y3]; 
   fi: 
   res; 
end: 


elgamal := proc(alpha, e, c, p) 
   local calpha, n, y; 
   read(addec); 
   calpha := alpha; 
   n := e; 
   y := 0; 
   # repeated doubling: add calpha to y for each 1 bit of e 
   while n > 0 do 
      if irem(n, 2, 'n') = 1 then 
         y := addec(calpha, y, c, p): 
      fi: 
      calpha := addec(calpha, calpha, c, p): 
   od: 
   y; 
end: 
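
The point addition and repeated doubling carried out by addec and elgamal can be sketched in Python for experimentation outside Maple. This is an illustrative analogue under stated assumptions: None stands for the identity (where the Maple code uses 0), modular inverses come from the built-in pow with exponent -1 (Python 3.8 or later), and the sample curve $y^2 = x^3 + 2x$ over $\mathbb{Z}_{11}$ is chosen here only as a small test case.

```python
def addec(le, re, c, p):
    """Add points le and re on y^2 = x^3 + c*x + d over Z_p (None = identity)."""
    if le is None:
        return re
    if re is None:
        return le
    x1, y1 = le[0] % p, le[1] % p
    x2, y2 = re[0] % p, re[1] % p
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                      # the points are inverses of each other
    if x1 == x2 and y1 == y2:
        lam = (3 * x1 * x1 + c) * pow((2 * y1) % p, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)


def elgamal(alpha, e, c, p):
    """Compute e*alpha by repeated doubling, one binary digit of e at a time."""
    y = None
    calpha = alpha
    n = e
    while n > 0:
        n, bit = divmod(n, 2)
        if bit == 1:
            y = addec(calpha, y, c, p)
        calpha = addec(calpha, calpha, c, p)
    return y


P = (2, 1)   # a point on the assumed sample curve y^2 = x^3 + 2x over Z_11
for e in (1, 2, 3, 6):
    print(e, elgamal(P, e, 2, 11))
```

On this sample curve the point (2, 1) has order 6, so the multiple with e = 6 is the identity.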


C.4 Chapter 9 Procedures 


switch := proc(n, x, maxsub) 
   local vs, i, j, k, pg, bk, nsw, pe, bk1, pn, allpoly, 
         mon, nlist, dg, vres, colist, pnum, part, pgel, 
         jnum, vt, pct, multiplicity, m; 
   vs := linalg[vector](n, 0); 
   vt := linalg[vector](n, 0); 
   nsw := 2^n; 
   read(ppoly); 
   multiplicity := proc(y, j) 
      j[y] := j[y] + 1; 
   end: 
   allpoly := 0; 
   nlist := {}; 
   pg := []: 
   colist := []: 
   for pnum from 1 to combinat[numbpart](n) do 
      for i from 1 to n do 
         j[i] := 0: 
      od: 
      if pnum = 1 then 
         part := combinat[firstpart](n); 
      else 
         part := combinat[nextpart](part); 
      fi: 
      map(multiplicity, part, 'j'): 
      pgel := []; 
      pct := 0; 
      for i from 2 to n do 
         for jnum from 1 to j[i] do 
            pgel := [op(pgel), 
                     [seq(pct + (jnum-1)*i + k, k = 1..i)]]; 
         od: 
         pct := pct + i*j[i]; 
      od: 
      pg := [op(pg), pgel]; 
      colist := [op(colist), 
                 product(1/(k^j[k]*j[k]!), 'k' = 1..n)]; 
   od: 
   m := 1; 
   for i from 1 to nops(pg) do 
      pe := pg[i]; 
      nlist := {}; 
      mon := 1; 
      dg := 0; 
      for j from 0 to nsw-1 do 
         bk := convert(j, base, 2); 
         bk1 := linalg[vectdim](bk); 
         for k from 1 to n do 
            vs[k] := 0; 
         od; 
         for k from 1 to bk1 do 
            vs[k] := bk[k]; 
         od: 
         for k from 1 to linalg[vectdim](vs) do 
            vt[linalg[vectdim](vs)-k+1] := vs[k]; 
         od: 
         vres := ppoly(pe, vs, n, x, nlist, m); 
         pn := vres[1]; 
         nlist := nlist union vres[2]; 
         dg := dg + vres[3]; 
         m := vres[4]; 
         mon := simplify(mon*pn); 
      od: 
      mon := colist[i] * mon * x[1]^(2^n-dg); 
      allpoly := simplify(allpoly + mon); 
   od: 
   maxsub := m; 
   RETURN(allpoly); 
end: 


ppoly := proc(pe, vb, n, x, nlist, max) 
   local i, j, dcycle, clen, ob10, nb10, res, cyct, vs, 
         vc, plist, k, dg, nsum, tmp, m, ct, tmax; 
   vs := []; 
   vc := []; 
   plist := {}; 
   tmax := max; 
   for i from 1 to n do 
      vs := [vb[i], op(vs)]; 
      vc := [vb[i], op(vc)]; 
   od: 
   res := 1; 
   dg := 0; 
   cyct := 0; 
   if linalg[vectdim](pe) = 0 then 
      res := res * x[1]; 
      dg := dg + 1; 
   fi: 
   if linalg[vectdim](pe) <> 0 then 
      ob10 := convert([seq(vs[linalg[vectdim](vs) - ct + 1], 
                 ct = 1 .. linalg[vectdim](vs))], base, 2, 10); 
      if linalg[vectdim](ob10) > 1 then 
         m := linalg[vectdim](ob10); 
         nsum := 0; 
         for i from 1 to m do 
            nsum := nsum + ob10[m-i+1]*10^(m-i); 
         od: 
         ob10 := subsop(1 = nsum, ob10); 
      fi: 
      if linalg[vectdim](ob10) = 0 then 
         res := res*x[1]; 
         dg := dg+1; 
         plist := plist union {0}; 
      else 
         if (member(ob10[1], nlist) = false) and 
            (linalg[vectdim](pe) <> 0) 
         then 
            plist := plist union {ob10[1]}; 
            nb10 := -1; 
            cyct := 0; 
            while nb10 <> ob10[1] do 
               cyct := cyct + 1; 
               for i from 1 to linalg[vectdim](pe) do 
                  dcycle := pe[i]; 
                  clen := linalg[vectdim](dcycle); 
                  for j from 1 to clen-1 do 
                     vs := 
                        subsop(dcycle[j+1] = vc[dcycle[j]], vs); 
                  od; 
                  vs := 
                     subsop(dcycle[1] = vc[dcycle[clen]], vs); 
                  for k from 1 to n do 
                     vc := subsop(k = vs[k], vc); 
                  od: 
               od: 
               plist := plist union {nb10}; 
               if linalg[vectdim](convert 
                     ([seq(vs[linalg[vectdim](vs) - ct + 1], 
                        ct = 1 .. linalg[vectdim](vs))], 
                     base, 2, 10)) > 1 
               then 
                  nsum := 0; 
                  tmp := convert 
                     ([seq(vs[linalg[vectdim](vs) - ct + 1], 
                        ct = 1 .. linalg[vectdim](vs))], 
                     base, 2, 10); 
                  m := linalg[vectdim](tmp); 
                  for i from 1 to m do 
                     nsum := nsum + tmp[m-i+1]*10^(m-i); 
                  od: 
                  nb10 := nsum; 
               else 
                  nb10 := convert 
                     ([seq(vs[linalg[vectdim](vs) - ct + 1], 
                        ct = 1 .. linalg[vectdim](vs))], 
                     base, 2, 10)[1]; 
               fi: 
            od; 
            dg := dg + cyct; 
            res := res*x[cyct]; 
            if cyct > tmax then 
               tmax := cyct; 
            fi: 
         fi; 
      fi; 
   fi; 
   RETURN(res, plist, dg, tmax); 
end: 


Hints and Solutions to 
Selected Written 
Exercises 


Chapter 1 

4. (12)(34), (13)(24), (14)(23), (123), (132), (124), (142), (134), (143), 
(234), (243), identity. 

6. (12345), (13524), (14253), (15432), (25)(34), (13)(45), (15)(24), 
(12)(35), (14)(23), identity. 

7. $A_4$ and $(12)A_4$. 

9. (a) 5 
(b) 5 
(c) 2 
(d) 6 
(e) 6 

11. Let a be a cyclic generator for G, and suppose j is the smallest positive 
    integer for which $a^j \in H$. Use the fact that Z is a Euclidean domain 
    to show that $a^j$ is a cyclic generator for H. 

13. Example 1.7: $A_n$. 
    Example 1.8: The set of matrices A with det(A) = 1. 

15. Let $a \in S_n$ and $b \in A_n$, and argue that $a^{-1}ba \in A_n$. 

20. Yes. Use the fact that F[x] is a Euclidean domain. 

23. The primes. 

25. (a) 

        Power    Field Element 
        x^1      x 
        x^2      2x + 1 
        x^3      2x + 2 
        x^4      2 
        x^5      2x 
        x^6      x + 2 
        x^7      x + 1 
        x^8      1 


    (c) $f(x) = (x + 5)(x + 7)$ in $\mathbb{Z}_{11}[x]$. 
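
The table in part (a) can be regenerated mechanically. The Python sketch below assumes the reduction rule $x^2 = 2x + 1$ over $\mathbb{Z}_3$, which is read off the table itself rather than taken from the exercise statement; field elements are stored as hypothetical pairs (c1, c0) standing for c1*x + c0.

```python
def times_x(elem, p=3):
    """Multiply c1*x + c0 by x, reducing with x^2 = 2x + 1 (mod p)."""
    c1, c0 = elem
    # (c1*x + c0)*x = c1*x^2 + c0*x = c1*(2x + 1) + c0*x
    return ((2 * c1 + c0) % p, c1 % p)


elem = (0, 1)            # the field element 1
powers = []
for k in range(1, 9):    # x^1 through x^8
    elem = times_x(elem)
    powers.append(elem)
    print(k, elem)
```

The eighth power comes back to (0, 1), the element 1, and no earlier power does, which is what it means for x to be a cyclic generator of the eight nonzero field elements.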


29. f(x) is irreducible but not primitive since the order of x is 5; g(x) is 
not irreducible since 1 is a root of g(x), and h(x) is primitive. 


35. (a,b) =274+1,u=24+1,andv=2727+2+1. 


Chapter 2 


2. Use Propositions 2.8 and 2.9 with p = 13 and n = 1. With the 
   cyclic generator 2 for $\mathbb{Z}_{13}^*$, Proposition 2.8 yields the initial blocks 
   $D_0 = \{1, 3, 9\}$ and $D_1 = \{2, 6, 5\}$. The parameters for the resulting 
   block design are (13, 26, 6, 3, 1). 


4. Use Proposition 2.9 with p = n = 5. In this block design, there are 
150 drivers, each car is driven 24 times, and each pair of cars is driven 
by the same driver 3 times. Let x be a cyclic generator for the set of 
nonzero elements in a finite field of order 25, and construct 6 initial 
blocks with 4 elements in each one. For example, the first two initial 
blocks are $D_0 = \{x^0, x^6, x^{12}, x^{18}\}$ and $D_1 = \{x^1, x^7, x^{13}, x^{19}\}$. 


Chapter 3 


2. The Hadamard code with m = 4 satisfies the stated requirements. 


5. The following generator matrix G and parity check matrix H are one 
of many correct answers. 


[Matrices G and H are illegible in this copy.] 


7. $r_1$ can be corrected to (11100), $r_2$ can be corrected to (11011), and 
   $r_3$ cannot be corrected. 


Chapter 4 


1. g(x) = p(x), which yields a [7,4] BCH code. 
2. (a) r can be corrected to (1001110). 


5. Refer to Example 4.3. Note that if we consider only the first four 
   powers of a, then $g(x) = m_1(x)m_3(x)$, which has degree 8. The 
   resulting code has $2^7 = 128$ codewords and is 2-error correcting. 

8. (a) r can be corrected to (000111011001010). 

(b) r can be corrected to (111100010011010). 


Chapter 5 


1. (a) $g(x) = (x - a)(x - a^2)(x - a^3)(x - a^4) = x^4 + a^3x^3 + x^2 + ax + a^3$ 
   (b) The following polynomial is one of the codewords in C. 


       $(a^4x + a^5)g(x) = a^4x^5 + a^4x^4 + a^2x^3 + a^2x + a$ 


   (c) The codeword above converts to the following binary vector. 


       (010001000001011011000) 


2. (a) r(x) can be corrected to a°x® + aSa* + ax? + afz + a5. 


(c) r(x) can be corrected to xê + az? + a®x? + aba? + ax +a’. 


4. (a) r(x) can be corrected to a’x!? + ata™ + abr! + aĵx? + 2? + 
az’ + alg8 + a825 + atst + a’? + ax? + afr +a. 


Chapter 6 


3. The following is the key matrix A for the system. 


5 21 
A=|5 d 


5. (a) “HFXLKQOOFS”. 
(b) “NONETOSEND”. 


8. One possible way to find K is to use the 2 x 1 matrix 
2 
a. 


Bas) [PE 3] 


and the 1 x 2 matrix 


to form the following 3 x 3 involutory matrix K. 


                         [ 4   1   3] 
                    K =  [20  25  20] 
                         [23  25  24] 


Chapter 7 


1. (a) The ciphertext is 0 222 222 0 128 175 250 35 118 28 222 
201 99 0 216 175. 


2. The corresponding decryption exponent is b = 41. 
5. 22 total multiplications. 
8. p= 509 and q = 631. 


Chapter 8 


1. y = 16 and z=5. 


4. w=2z+1. 

7. (a) $E = \{(1, \pm 5), (2, \pm 1), (5, \pm 5), (7, \pm 4), (0,0), (3,0), (8,0), O\}$. 


   (b) E is not cyclic. Theorem 8.3 then states that E is isomorphic to 
       $\mathbb{Z}_6 \times \mathbb{Z}_2$. 


8. y = (0,1) and z = (8,2). 
10. w = (7,4). 


Chapter 9 


1. (c) $f(x_1, x_2, x_3) = \frac{1}{6}(x_1^3 + 3x_1x_2 + 2x_3)$, 4 distinct necklaces. 
   (d) $R^3 + R^2W + RW^2 + W^3$. 
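
The count in part (c) can be checked two ways: by substituting the number of colors into the cycle index $\frac{1}{6}(x_1^3 + 3x_1x_2 + 2x_3)$, and by brute-force enumeration. The Python sketch below assumes necklaces with 3 beads whose symmetries are all 6 permutations of the bead positions; the function names are hypothetical.

```python
from itertools import permutations, product


def count_by_cycle_index(k):
    """Evaluate (1/6)(x1^3 + 3*x1*x2 + 2*x3) at x1 = x2 = x3 = k."""
    return (k**3 + 3 * k**2 + 2 * k) // 6   # k(k+1)(k+2) is divisible by 6


def count_by_enumeration(k, n=3):
    """Count orbits of k-colorings of n beads under all position permutations."""
    seen = set()
    count = 0
    for coloring in product(range(k), repeat=n):
        if coloring in seen:
            continue
        count += 1                          # first member of a new orbit
        for perm in permutations(range(n)):
            seen.add(tuple(coloring[i] for i in perm))
    return count


print(count_by_cycle_index(2), count_by_enumeration(2))  # 4 4
```

With two colors both counts give 4, one necklace for each term of the pattern inventory in part (d).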


3. (a) 8 distinct necklaces. 
   (b) 2 distinct necklaces. 
6. See Example 9.5 and the results obtained in Section 9.5. 


7. 3984 distinct equivalence classes. 